ABSTRACT

One of the main challenges facing upcoming CMB experiments will be to distinguish the cosmologi- 
cal signal from foreground contamination. We present a comprehensive treatment of this problem and 
study how foregrounds degrade the accuracy with which the Boomerang, MAP and Planck experiments 
can measure cosmological parameters. Our foreground model includes not only the normalization, fre- 
quency dependence and scale dependence for each physical component, but also variations in frequency 
dependence across the sky. When estimating how accurately cosmological parameters can be measured, 
we include the important complication that foreground model parameters (we use about 500) must be 
simultaneously measured from the data as well. Our results are quite encouraging: despite all these 
complications, precision measurements of most cosmological parameters are degraded by less than a 
factor of 2 for our main foreground model and by less than a factor of 5 in our most pessimistic scenario. 
Parameters measured through large-angle polarization signals suffer more degradation: up to a factor of 5 in the
main model and 25 in the pessimistic case. The foregrounds that are potentially most damaging and 
therefore most in need of further study are vibrating dust emission and point sources, especially those 
in the radio frequencies. It is well-known that E and B polarization contain valuable information about 
reionization and gravity waves, respectively. However, the cross-correlation between polarized and un- 
polarized foregrounds also deserves further study, as we find that it carries the bulk of the polarization 
information about most other cosmological parameters. 

Subject headings: cosmic microwave background — methods: data analysis 



1. INTRODUCTION 

Our ability to measure cosmological parameters using 
the cosmic microwave background (CMB) will only be 
as good as our understanding of microwave foregrounds, 
e.g., synchrotron, free-free and dust emission from our own
Galaxy and extragalactic objects. For this reason, the re- 
cent dramatic progress in the CMB field has stimulated 
much work on modeling foregrounds and on algorithms 
for removing them. 

Early work on the subject (Lubin & Smoot 1991; Ben- 
nett et al. 1992, 1994; Brandt et al. 1994; Dodelson & 
Stebbins 1994) focused on the frequency dependence of 
foregrounds and how this could be used to discriminate 
them from CMB. Work done for the Phase A study of 
the Planck satellite mission (Tegmark & Efstathiou 1996, 
hereafter TE96; Bouchet et al. 1996) showed that the scale 
dependence of foregrounds was also important, often be- 
ing quite different from the roughly scale-free CMB fluc- 
tuations, and that a multifrequency version of Wiener fil- 
tering could take advantage of this to improve foreground 
removal. 

The growing interest in CMB polarization, driven by 
the combination of theoretical utility (Kamionkowski et 
al. 1997; Zaldarriaga & Seljak 1997; Hu & White 1997) 
and experimental feasibility (Staggs et al. 1999), has 
spurred the modeling of foreground polarization signals 
(e.g., Keating et al. 1998; Zaldarriaga 1998). Such models 
have been further refined for both the MAP mission (Re- 
fregier et al. 1998) and the final Planck science case (AAO 
1998), much of which is reviewed in Bouchet & Gispert
(1999, hereafter BG99) and de Zotti et al. (1999). 

Yet another complication is that the frequency depen- 
dence of foregrounds generally varies slightly across the 
sky. This can be modeled as each foreground having two 
or more subcomponents (TE96; AAO 1998; BG99) or more 
generally by introducing the notion of frequency coherence 
(Tegmark 1998, hereafter T98). 

The purpose of the present paper is to assess the im- 
pact of foregrounds on CMB experiments, including all of 
the above-mentioned complications. This is important for 
two reasons, aside from a general desire to have realistic 
expectations for future CMB experiments: 

1. It helps identify which foregrounds are most damag- 
ing and therefore most in need of further study. 

2. It is useful for optimizing future missions and for 
assessing the science impact of design changes to, 
e.g., Planck. 

Such a comprehensive analysis is quite timely, since our 
knowledge of foregrounds has undergone somewhat of a 
phase transition during the last year or two: whereas 
earlier foreground models were quite speculative, gener- 
ally based on extrapolations from lower or higher frequen- 
cies, foregrounds have now been convincingly detected 
and quantified at CMB frequencies by CMB experiments 
such as COBE DMR (Kogut et al. 1996, hereafter K96), 
MAX (Lim et al. 1996), Saskatoon (de Oliveira-Costa et 
al. 1997), OVRO (Leitch et al. 1997), the 19 GHz survey 
(de Oliveira-Costa et al. 1998) and Tenerife (de Oliveira- 
Costa et al. 1999a). 

This paper extends prior work in a number of ways.
The treatment of spectral variations is more general than
in the work for the Planck proposal (TE96; Bersanelli et
al. 1996; Bouchet et al. 1999; BG99) and in Knox (1999,
hereafter K99). It propagates the effect of foregrounds 
all the way through to the measurement of cosmological 
parameters, which has not been previously done except for 
a limited parameter set (Prunet et al. 1998a). Finally, it 
quantifies the degradation caused by the need to measure 
the statistical properties of the foregrounds directly from 
the CMB data, jointly with the CMB parameters. 

The rest of this paper is organized as follows. In §2, we
present models for the various physical foreground components.
We then present our mathematical formalism for foreground
removal and apply it to the Boomerang, MAP and Planck
missions, computing the level of foreground residuals in the
cleaned map for various scenarios. In §5, we compute the extent
to which this residual contamination degrades the measurement
of cosmological parameters, both when the foreground power
spectra are known and when they must be computed from
the CMB data itself. In both cases, we study how robust
our results are to variations in the foreground model. We
then summarize our conclusions.

2. FOREGROUND MODELS 1: THE PHYSICS 

The foreground model described in this section is summarized
in Table 2. We make three models: one pessimistic (PESS),
one middle-of-the-road (MID) and one
optimistic (OPT). Since we want to span the entire range 
of uncertainties, we have made both the PESS and OPT 
models rather extreme in the (lamentably many) cases 
where observational constraints are weak or absent. The 
MID model is intended to be fairly realistic, but some- 
what on the conservative (pessimistic) side. A FORTRAN 
code evaluating these models has been made available 
at www.sns.ias.edu/~max/foregrounds.html, and we will
continually update this as our foreground knowledge im- 
proves. 

2.1. Notation 

Our foreground model involves specifying the following 
quantities for each physical component k and each of the 
four types of polarization power (P = T, E, B and X): 

1. Its average frequency dependence $\Theta^P_{(k)}(\nu)$.

2. Its frequency coherence $\xi^P_{(k)}$.

3. Its spatial power spectrum $C_\ell^{P(k)}$.

Although this notation will be described in great detail
in §3, some clarifications are already in order at this point.
$\Theta_{(k)}(\nu)$ gives the frequency dependence of the rms
fluctuations in thermodynamic temperature referenced to the
CMB blackbody. For reference, antenna temperature is
converted to thermodynamic temperature by multiplying by

$c(\nu) \equiv \left[\frac{2\sinh(x/2)}{x}\right]^2$,   (1)

where $x \equiv h\nu/kT_{\rm cmb} \approx \nu/56.8\,{\rm GHz}$. Specific intensity or
surface brightness is converted to antenna temperature by
multiplying by

$c_*(\nu) \equiv \frac{1}{2k}\left(\frac{c}{\nu}\right)^2$.   (2)

We assume that the frequency dependence is independent
of polarization type and angular scale. Note that the
latter is not the same as assuming that the frequency
dependence of the sky brightness does not vary with position
on the sky. The frequency coherence $\xi$ quantifies this
spectral variation as described in §3. For the purpose of
this section, it is sufficient to know that $\xi \approx 1/(\sqrt{2}\,\Delta\alpha)$,
where $\Delta\alpha$ is the dispersion in the foreground spectral
index across the sky. If we write the foreground specific
intensity in the form $I_\nu = f(\nu)\nu^{\alpha}$ for some shape function
$f$, then $\Delta\alpha$ is simply the rms fluctuation in $\alpha$. Because our
foreground models choose $\Theta$ and $\xi$ to be independent of
the polarization type, we will suppress the $P$ superscript
in this section. We consider the general case in §3.
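
As a quick numerical illustration of the relation $\xi \approx 1/(\sqrt{2}\,\Delta\alpha)$, the following sketch evaluates the frequency coherence for the spectral-index dispersions adopted for the MID model in §2.3 (0.15, 0.02, 0.3 and 0.5). The function and dictionary names are illustrative only and are not taken from the FORTRAN code mentioned above.

```python
import math


def frequency_coherence(delta_alpha):
    """Approximate frequency coherence xi ~ 1 / (sqrt(2) * delta_alpha)."""
    if delta_alpha <= 0:
        raise ValueError("delta_alpha must be positive")
    return 1.0 / (math.sqrt(2.0) * delta_alpha)


# Spectral-index dispersions quoted for the MID model in Section 2.3.
MID_DELTA_ALPHA = {
    "synchrotron": 0.15,
    "free-free": 0.02,
    "thermal dust": 0.3,
    "spinning dust": 0.5,
}

for name, delta_alpha in MID_DELTA_ALPHA.items():
    xi = frequency_coherence(delta_alpha)
    print(f"{name:14s} delta_alpha = {delta_alpha:4.2f}  xi ~ {xi:6.2f}")
```

The smaller the dispersion in the spectral index, the larger the coherence, so free-free emission is by far the most coherent of these components.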

We define $C_\ell$ in the usual manner, namely as the variance
of the amplitude of fluctuations in the $\ell$th multipole.
We then model the power spectra of all components except
the CMB anisotropies and the thermal Sunyaev-Zel'dovich
(SZ) effect as power laws

$C_\ell^{P(k)} = (pA)^2\,\ell^{-\beta}$,   (3)

where $\beta$ and the normalization $pA$ depend on the type
of foreground ($k$) and polarization ($P$) as shown in Table 2.
For convenience, we factor the normalization into two
terms: $A$ gives the normalization of the unpolarized
component and $p$ gives the relative normalization of the
polarized components. We will explore more general power
spectrum models later in the paper.

2.2. What is a foreground and what is a signal? 

Of the multitude of physical mechanisms that create 
microwave fluctuations in the sky, where should the line 
be drawn between what constitutes a cosmic signal and 
what is to be considered foreground contamination? All 
workers in the field agree that effects occurring around or 
before recombination at $z \sim 10^3$ constitute signal, whereas
dust, free-free and synchrotron radiation are foregrounds,
regardless of whether the origin is in the Milky Way or 
in extragalactic objects. For the remaining effects, the 
distinction is less clear and somewhat arbitrary. It has 
been common to label all effects occurring long after re- 
combination (see Refregier 1999 for a recent review) as 
foregrounds, which would then include, e.g., the late integrated
Sachs-Wolfe (ISW) effect (Sachs & Wolfe 1967;
Boughn & Crittenden 1999) and gravitational lensing of 
the CMB. We will take a different and more goal-oriented 
approach. When the goal is to measure cosmological pa- 
rameters, the crucial issue is not when or how the signal 
was created, but how reliably it can be calculated. We 
therefore make the following operational definition of what 
constitutes a foreground: 

• A foreground is an effect whose dependence on cos- 
mological parameters we cannot compute accurately 
from first principles at the present time. 

Fig. 1. The MID model for synchrotron radiation (heavy line). The first three columns show the uncleaned amplitude as a function of scale
at 30, 100 and 217 GHz. The rows show the temperature (T), cross-correlation (X) and E-channel polarization, respectively. For reference,
the CMB power spectrum of our fiducial ΛCDM cosmology (§5.1) is also shown (thin solid line) together with the total foreground
power including (dotted) and excluding (dashed) Planck detector noise. The second three columns show the foreground amplitude when
the maps are cleaned according to the optimal procedure described later in the paper; this method assumes that the foreground properties
are well-known. The cleaning depends on the experimental specifications; we show results for Boomerang, MAP and Planck. There is no
polarization data in the Boomerang column, since this is an unpolarized experiment.



With this definition, gravitational lensing of the CMB,
the late ISW effect, and the Ostriker-Vishniac (OV) effect
(Ostriker & Vishniac 1986; Vishniac 1987) are not
foregrounds, even though the latter is second order and
non-Gaussian (Hu et al. 1994; Dodelson & Jubas 1995) and the
two former jointly create a non-Gaussian bispectrum (Zal- 
darriaga & Seljak 1999; Goldberg & Spergel 1999). On the 
other hand, patchy reionization and the thermal SZ effect 
are foregrounds, since their calculation requires hydrody- 
namics simulations of reionization (reviewed in Haiman & 
Knox 1999) and galaxy formation. 

2.3. Diffuse galactic foregrounds: synchrotron, free-free 
& dust emission 

Our knowledge of Galactic foregrounds improved substantially
during 1998. Whereas older models (e.g., TE96)
were mainly based on extrapolations from frequencies far 
outside the CMB range, a number of statistically signif- 
icant detections of cross-correlation between new CMB 
maps and various foreground templates now allow us to 
normalize many foreground signals directly at the frequen- 
cies of interest. 

2.3.1. Synchrotron radiation 

For synchrotron emission in our Galaxy (see Smoot 1999
for a recent review), we model the frequency dependence
as $\Theta_{({\rm synch})}(\nu) \propto c(\nu)\nu^{-\alpha}$. Because the spectral index $\alpha$
depends on the energy distribution of relativistic electrons
(Rybicki & Lightman 1979), it may vary somewhat across
the sky. One also expects a spectral steepening towards
higher frequencies, corresponding to a softer electron
spectrum (Banday & Wolfendale 1991; Fig 5.3 in Jonas 1999).
Based on the data described in Platania et al. (1998), we
take $\alpha = 2.8$ for our MID estimate for the unpolarized
intensity, with a spectral uncertainty $\Delta\alpha = 0.15$. As to
the power spectrum $\ell^{-\beta}$, the 408 MHz Haslam map
suggests $\beta$ of order 2.5 to 3.0 down to its resolution limit $\sim 1°$
(TE96; Bouchet et al. 1996), although the interpretation is
complicated by striping problems (Finkbeiner et al. 1999).
The Parkes survey (Duncan 1997; Duncan 1998, hereafter
D98) enables an extension of this down to 4', i.e., $\ell \sim 900$,
and gives $\beta \sim 2.4$ (de Oliveira-Costa et al. 1999b); we
adopt this value to be conservative, since we will normalize
on large angular scales. This agrees qualitatively with
theoretical power spectrum estimates assuming isotropic
turbulence with a $k^{-11/3}$ Kolmogorov spectrum for the
Galactic magnetic field (Tchepurnov 1997).

For the polarized synchrotron component, our observational
knowledge is unfortunately very incomplete. The
only available measurement of the polarized synchrotron
power spectrum is from the 2.4 GHz D98 maps, which
exhibit a much bluer power spectrum in polarization than in
intensity, with $\beta \sim 1.0$ instead of 2.5 (de Oliveira-Costa et
al. 1999b). However, at least part of this patchiness is due
to modulations in Faraday rotation by small-scale variations
in the Galactic magnetic field. These results therefore
cannot be readily extrapolated to higher frequencies
such as 50 GHz, where Faraday rotation (which scales as
$\nu^{-2}$) becomes irrelevant. A second difficulty lies in
extrapolating from the D98 observing region around the Galactic
plane to higher latitudes, where the smaller mean distance
to visible emission sources may well result in less small-scale
power in the angular distribution. The polarization
maps of Brouw & Spoelstra (1976) extend to high Galactic
latitudes and up to 1.4 GHz but unfortunately are
undersampled, making it difficult to draw inferences about the
polarized power spectrum from them. To bracket the
uncertainty, we take $\beta = 1.0$ for PESS, $\beta = 1.4$ for MID and
$\beta = 3$ (the same power spectrum slope as for the unpolarized
intensity) for OPT.

Although Faraday rotation softens the frequency dependence
to $\alpha \sim 1.6$ for $\nu < 5$ GHz (de Oliveira-Costa et al.
1999b), we assume that the polarization fraction saturates
to a constant value for $\nu \gg 10$ GHz, as Faraday rotation
becomes irrelevant. We therefore use the same $\alpha$ and $\Delta\alpha$
for polarized and unpolarized synchrotron radiation.

For the MID scenario, we normalize the unpolarized synchrotron
component to the cross-correlation with the 19
GHz map found by de Oliveira-Costa et al. (1998). This
gives $\sigma = 52 \pm 17\,\mu{\rm K}$ on the 3° scale (see footnote 3) for a 20° galactic
cut, retaining roughly the cleanest 65% of the sky. This
agrees well with the synchrotron amplitude obtained in
the cross-correlation analyses using the Tenerife 10 and 15
GHz maps (de Oliveira-Costa et al. 1999a; Jones 1999).
For the PESS model, we use the 7.1 μK upper limit from
COBE DMR (found by K96) at 31.5 GHz on the 7° scale.

The degree of synchrotron polarization typically varies 
between 10% and 75% on large scales (Brouw & Spoelstra 
1976), so we normalize our models to give 10% (OPT), 
30% (MID) and 75% (PESS) rms polarization on COBE 
scales. Because the polarization power spectra in the MID 
and PESS models are blue-tilted relative to the intensity 
power spectra, the rms polarization exceeds 100% in these 
models on sub-degree scales. This is physically possible 
because the $\ell = 0$ contribution to the intensity map has
been ignored; in an extreme case, it is possible to have 
polarization fluctuations even with a perfectly smooth in- 
tensity map. 
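
The normalization step and the growth of the polarization fraction toward small scales can be illustrated with the beam-smoothed rms of footnote 3, $\sigma^2 = \sum_\ell \frac{2\ell+1}{4\pi} e^{-\theta^2\ell(\ell+1)} C_\ell$. The sketch below is only indicative. It assumes pure power laws with the MID slopes quoted above ($\beta = 2.4$ for intensity, 1.4 for polarization), truncates the sum at a hypothetical $\ell_{\rm max} = 5000$, takes 7° as the COBE scale, and ignores any frequency scaling between the data sets. The function names are illustrative.

```python
import math

FWHM_TO_SIGMA = 1.0 / math.sqrt(8.0 * math.log(2.0))


def rms_fluctuation(amplitude, beta, fwhm_deg, lmax=5000):
    """Beam-smoothed rms for a power law C_l = amplitude^2 * l^(-beta):
    sigma^2 = sum_{l>=2} (2l+1)/(4 pi) * exp(-theta^2 l(l+1)) * C_l."""
    theta = math.radians(fwhm_deg) * FWHM_TO_SIGMA  # rms beam width in radians
    total = 0.0
    for ell in range(2, lmax + 1):
        window = math.exp(-theta ** 2 * ell * (ell + 1))
        c_ell = amplitude ** 2 * ell ** (-beta)
        total += (2 * ell + 1) / (4.0 * math.pi) * window * c_ell
    return math.sqrt(total)


def normalize_amplitude(target_rms, beta, fwhm_deg):
    """Amplitude for which the beam-smoothed rms equals target_rms."""
    return target_rms / rms_fluctuation(1.0, beta, fwhm_deg)


# Unpolarized synchrotron: 52 uK rms on the 3 degree scale.
A_T = normalize_amplitude(52.0, 2.4, 3.0)

# Polarized synchrotron (MID): 30% of the intensity rms on the 7 degree scale.
A_P = normalize_amplitude(0.30 * rms_fluctuation(A_T, 2.4, 7.0), 1.4, 7.0)

for fwhm in (7.0, 3.0, 1.0, 0.5, 0.2, 0.1):
    ratio = rms_fluctuation(A_P, 1.4, fwhm) / rms_fluctuation(A_T, 2.4, fwhm)
    print(f"FWHM = {fwhm:4.1f} deg  rms polarization fraction = {ratio:5.2f}")
```

Because the polarized spectrum is shallower than the intensity spectrum, the ratio rises steadily as the beam shrinks, which is the behavior described in the text.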

2.3.2. Free-free emission 

Of all diffuse Galactic foregrounds, free-free emission is
the one whose frequency dependence is best known. We
model it as a power law $\Theta_{({\rm ff})}(\nu) \propto c(\nu)\nu^{-\alpha}$, where
$\alpha = 2.15$ and $\Delta\alpha = 0.02$. In our OPT and MID scenarios,
we assume that this emission is completely unpolarized
(Rybicki & Lightman 1979). However, free-free emission
can become polarized by Thomson scattering off free
electrons within the H II region itself (Keating et al. 1998;
Davies & Wilkinson 1999). We therefore assume a 10%
polarization level in the PESS model, which corresponds
to the most extreme case of an optically thick cloud and
no line-of-sight superpositions of interloper H II regions.

Footnote 3. For a Gaussian beam with rms width $\theta$, the rms
fluctuations $\sigma$ are given by

$\sigma^2 = \sum_{\ell=2}^{\infty} \frac{2\ell+1}{4\pi}\, e^{-\theta^2\ell(\ell+1)}\, C_\ell$.   (4)

The angular "scale" mentioned here and elsewhere generally refers
to the full-width-half-maximum (FWHM) beamwidth, given by
FWHM $= \sqrt{8\ln 2}\,\theta$.

Although the spectrum of free-free emission is well-known,
the amplitude and power spectrum are not. Since
dust dominates at high frequencies, synchrotron at low
frequencies and CMB in the intermediate range, it is difficult
to obtain a spatial template of free-free emission. Hα maps
should be able to play this role shortly (see McCullough
et al. 1999 for a review), but in the interim, we must make
do with more indirect estimates. K96 obtained a 2-sigma
upper limit of 14.2 μK for the rms free-free fluctuations at
53 GHz by taking a linear combination of the three COBE
DMR maps that projected out the CMB. We use this
normalization for our PESS model, and it is consistent with
the upper limit of Coble et al. (1999). K96 also found
a highly significant detection of a component correlated
with the DIRBE dust maps whose frequency dependence
was consistent with $\alpha = 2.15$. Similar correlations have
been detected for the Saskatoon data (de Oliveira-Costa et
al. 1997), the 19 GHz map (de Oliveira-Costa et al. 1998)
and the OVRO Ring experiment (Leitch et al. 1997); see
Kogut (1999) for a review of this puzzle. For our MID
model, we will follow K96 in assuming that this component
is in fact free-free emission, which gives an rms of
7.6 μK at 53 GHz on DMR scales for a 30° galaxy cut.
For the power spectrum shape, we assume $\beta = 3$ for OPT
and MID (as for dust) and $\beta = 2.2$ (as for synchrotron
radiation) for PESS. Again this agrees qualitatively with
theoretical estimates assuming isotropic turbulence with
a Kolmogorov spectrum for electron density fluctuations
in the interstellar medium (Tchepurnov 1997).

2.3.3. Dust 

Fig. 3. Same as Fig. 1, but for thermal (vibrational) dust emission.

For vibrational emission from dust grains in the interstellar
medium, we model the frequency dependence as

$\Theta_{({\rm dust})}(\nu) \propto c(\nu)\,c_*(\nu)\,\frac{\nu^{3+\alpha}}{e^{h\nu/kT_{({\rm dust})}} - 1}$.   (5)

We assume a dust temperature $T_{({\rm dust})} = 18$ K (MID) and
an emissivity $\alpha = 1.7$ (K96). The effective emissivity
could vary across the sky if the relative proportions of
different types of dust grains shift, and modulations in the
dust temperature with, e.g., galactic latitude, would further
increase the dispersion in the frequency dependence.
Estimates of $\alpha$ have ranged between 1.4 and 2.0 across
the sky and in multi-component models (e.g., Reach et al.
1995). Although recent work has weakened the evidence
for multiple dust temperatures, at least in the cleanest
parts of the sky (see the discussion in BG99), joint analysis
of the DIRBE and FIRAS data sets has given strong
indications that two components with different emissivities
are present even at high Galactic latitudes (Schlegel
et al. 1998; Finkbeiner & Schlegel 1999). We therefore
take $\Delta\alpha = 0.3$ (MID).
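
The frequency dependences adopted so far can be compared with a few lines of code. The sketch below makes several assumptions: $T_{\rm cmb} = 2.726$ K (equivalent to $x \approx \nu/56.8$ GHz), the Rayleigh-Jeans scaling $c_*(\nu) \propto \nu^{-2}$, equation (5) read as a modified blackbody, and an arbitrary overall normalization, so only ratios between frequencies are meaningful. The function names are illustrative.

```python
import math

H_OVER_K = 0.0479924  # Planck constant over Boltzmann constant, in K per GHz
T_CMB = 2.726         # K; assumed value, consistent with x ~ nu / 56.8 GHz


def c_thermo(nu_ghz):
    """Antenna to thermodynamic temperature factor, [2 sinh(x/2) / x]^2."""
    x = H_OVER_K * nu_ghz / T_CMB
    return (2.0 * math.sinh(x / 2.0) / x) ** 2


def c_star(nu_ghz):
    """Specific intensity to antenna temperature, up to a constant: nu^-2."""
    return nu_ghz ** -2.0


def theta_power_law(nu_ghz, alpha):
    """Synchrotron or free-free: Theta ~ c(nu) * nu^(-alpha)."""
    return c_thermo(nu_ghz) * nu_ghz ** (-alpha)


def theta_dust(nu_ghz, t_dust=18.0, alpha=1.7):
    """Thermal dust: c(nu) c_*(nu) nu^(3+alpha) / (exp(h nu / k T_dust) - 1)."""
    x_dust = H_OVER_K * nu_ghz / t_dust
    return (c_thermo(nu_ghz) * c_star(nu_ghz)
            * nu_ghz ** (3.0 + alpha) / math.expm1(x_dust))


REF = 100.0  # reference frequency in GHz for the printed ratios
for nu in (30.0, 100.0, 217.0, 353.0):
    synch = theta_power_law(nu, 2.8) / theta_power_law(REF, 2.8)
    free_free = theta_power_law(nu, 2.15) / theta_power_law(REF, 2.15)
    dust = theta_dust(nu) / theta_dust(REF)
    print(f"{nu:5.0f} GHz  synch {synch:9.3e}  free-free {free_free:9.3e}  dust {dust:9.3e}")
```

Synchrotron and free-free emission fall with frequency while thermal dust rises, which is why the cleanest CMB window lies at intermediate frequencies.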

As to the power spectrum $\ell^{-\beta}$, the combined DIRBE
and IRAS dust maps suggest a slightly shallower slope
$\beta = 2.5$ (Schlegel et al. 1998) than earlier work finding
$\beta \approx 3.0$ (Gautier et al. 1992; Low & Cutri 1994; Guarini et
al. 1995; TE96). However, a recent analysis of the DIRBE
maps has shown no evidence of a departure from an $\ell^{-3}$
power law for $\ell < 300$ (Wright 1998); we will use this value
for the MID model because only the behavior at low $\ell$ is
important for the present analysis.

Dust emission may be highly polarized if the grains align
in the local magnetic field (Wright 1987). For the
polarization power spectra, we use the models of Prunet et al.
(1998b) and Prunet & Lazarian (1999), which give $\beta = 1.3$
for E, $\beta = 1.4$ for B, and $\beta = 1.95$ for X. This corresponds
to about 1% polarization in E on the 7° scale and greater
polarization on smaller scales.

We normalize the (MID) unpolarized dust power spectrum
using the DIRBE-DMR cross-correlation analysis of
K96, which gives rms fluctuations of 2.9 μK at 53 GHz on
the COBE angular scale. This is a factor 2.3 higher
than the Prunet et al. model at 100 GHz, and we boost
their polarization normalization by the same factor to be
conservative. The OPT and PESS normalizations are a
factor of 3 lower and higher, respectively, for T on the
7° scale. The E and B normalization is a factor of three
lower for OPT but a factor 10 higher for PESS, the latter
corresponding to about 15% polarization on the 5' scale.

2.3.4. "Anomalous" dust emission 

An alternative interpretation of the dust-correlated foreground
component described in §2.3.2 has been proposed
by Draine & Lazarian (1998, hereafter DL98). They
identify it as dust emission after all, but radiating via rotational
rather than vibrational excitations. The latest Tenerife
measurements strongly support this idea (de Oliveira-Costa
et al. 1999) since the observed turnover in the spectrum
with a decrease from 15 to 10 GHz is incompatible
with free-free emission alone. This emission will be
dominated by the very smallest dust grains (more appropriately
called clusters, since they may consist of only $\sim 10^2$
atoms). Many DL98 models are well fit by spectra of the
form of equation (5), but with rather unusual parameters.
For our MID model, we take the rather typical DL98 model
that is fit by $T_{({\rm dust})} = 0.25$ K, $\alpha = 2.4$. However, the range
of theoretically and observationally allowed spectra is very
large, and magnetic-dipole dust emission could have yet
another spectral signature (Draine & Lazarian 1999). We
adopt a very large spectral uncertainty $\Delta\alpha = 0.5$ to reflect
this. For our pessimistic model, we adopt an extremely
blue ($\beta = 1.2$) power spectrum for this component, since
the work of Leitch et al. (1997) indicates that this
component may be very inhomogeneous on small scales.

We normalize our MID model so that spinning dust accounts
for the entire dust-correlated signal at 31.5 GHz.
This double-counting is of course mildly conservative, since
we normalized free-free emission in the same way. Given
the complete absence of power spectrum measurements for
this component, the MID model simply assumes the same
power spectra as for regular dust emission, both in intensity
and polarization, as well as the same polarization fractions.
The PESS scenario gives 10% polarization (Prunet
& Lazarian 1999). In the OPT scenario, we assume no
spinning dust component at all.

Fig. 4. Same as Fig. 1, but for thermal spinning dust emission. (Panel columns: 30 GHz, 100 GHz, 217 GHz, Boomerang, MAP, Planck; horizontal axes span multipoles 10 to 1000.)

Fig. 5. Same as Fig. 1, but for the thermal SZ effect from filaments.

Throughout this paper, we are assuming that the differ- 
ent foreground components are uncorrelated. This is prob- 
ably not the case for, e.g., spinning and vibrating dust. 
Once these correlations are better measured, one can take 
advantage of this information to improve the foreground 
removal, as well as define linear combinations of the fore- 
grounds that are uncorrelated. 

2.4. Thermal and kinematic SZ effect 

The thermal SZ effect (Sunyaev & Zel'dovich 1970) is 
the characteristic distortion of the CMB spectrum caused 
by hot ionized gas in galaxy clusters and filaments, whereas 
the kinematic SZ effect is the temperature fluctuation oc- 
curring when motion of such gas Doppler shifts the CMB 
spectrum. The dominant part of the kinematic SZ effect 
caused by matter fluctuations in the linear regime is known 
as the Ostriker-Vishniac (OV) effect (Vishniac 1987), and
can be accurately computed using perturbation theory (Hu
& White 1996). According to the definition we gave in §2.2,
a process is a foreground only if it cannot be accurately 
computed at the present time, so only part of the kinetic 
SZ effect qualifies as a foreground: the small correction 
to the OV effect caused by nonlinear structures, whose 
computations would require accurate hydrodynamics sim- 
ulations. Since this correction is likely to be small, we will 
not attempt to model it in the present paper. 

The thermal SZ effect, on the other hand, does qual- 
ify as a foreground (Holder & Carlstrom 1999). Just as 
we assumed removal of bright radio and IR point sources, 
we will assume that cores of known clusters have been 
discarded from the CMB maps. In addition to removing 
known clusters, it has been estimated that of order $10^4$
additional clusters can be detected (and removed) using 
the Planck data (de Luca et al. 1995; Aghanim et al. 1997; 
Refregier et al. 1998; Refregier 1999), reducing both the 
kinematic and thermal SZ effect from clusters to negligi- 
ble levels. The SZ foreground will therefore be dominated 
by the thermal effect from filaments and other large-scale 
structures outside of clusters.

Table 1

CMB Experimental Specifications

Experiment    ν       FWHM     10^6 ΔT/T    10^6 ΔT/T
                               (unpol)      (pol)
Boomerang     90      20       7.4          -
              150     12       5.7          -
              240     12       10           -
              400     12       80           -
MAP           22      56       1.1          5.9
              30      41       5.7          8
              40      28       8.2          11.6
              60      21       11.0         15.6
              90      13       18.3         25.9
Planck        30      33       1.6          2.3
              44      23       2.4          3.4
              70      14       3.6          5.1
              100     10       4.3          6.1
              100     10.7     1.7          -
              143     8.0      2.0          3.7
              217     5.5      4.3          8.9
              353     5.0      14.4         -
              545     5.0      147          208
              857     5.0      6670         -

NOTES. Specifications used for Boomerang, MAP and Planck.
Frequencies ν are in GHz. Full width at half maxima (FWHM) of the
beams are in arcminutes. Boomerang covers a fraction $f_{\rm sky} \sim 2.6\%$
of the sky, while we assume a useful sky fraction of 65% for MAP
and Planck. $(w^P)^{-1/2} = \Delta T \times {\rm FWHM} \times \pi/10800$. In practice, we
combine the two Planck 100 GHz channels into one channel with
FWHM of 10.7' and ΔT/T of 1.57 and 5.68 × 10^-6 for unpolarized
and polarized channels, respectively.

As our MID estimate of this effect, we use the semianalytic results
of Persi et al. (1995), whose ΛCDM model is well fit by the broken power law
power spectrum

$\ldots = (0.26\,\mu{\rm K}\,A)^2 \left[\left(\frac{\ell}{\ell_*}\right)^{n_1\gamma} + \left(\frac{\ell}{\ell_*}\right)^{n_2\gamma}\right]^{1/\gamma}$.   (6)

Here $n_1 = 1$ and $n_2 = -2$ are the asymptotic slopes at low
and high $\ell$ respectively, while $\gamma = -0.25$ gives the sharpness
of the peak, which is located at $\ell_{\rm peak} = 4000$ using
$\ell_* = (-n_1/n_2)^{1/[\gamma(n_1-n_2)]}\,\ell_{\rm peak}$. Equation (6) is normalized
in the Rayleigh-Jeans limit $\nu \ll 56$ GHz for $A = 1$. Our
PESS model is normalized an order of magnitude higher,
roughly in line with current observational upper limits.
Relativistic corrections to the frequency dependence are 
important for hot clusters (Wright 1979; Rephaeli 1995; 
Stebbins 1997). Since we are throwing out the known clus- 
ters and the filaments that dominate the remaining effect 
are much cooler, the nonrelativistic SZ-spectrum should 
be quite a good approximation. In thermodynamic tem- 
perature, this is given by (Sunyaev & Zel'dovich 1970) 



$\Theta_{({\rm SZ})}(\nu) \propto 2 - \frac{x}{2}\coth\frac{x}{2} \to 1$ as $x \to 0$,   (7)

where $x \equiv h\nu/kT_{\rm cmb} \approx \nu/56.8$ GHz.
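
As an illustration of the nonrelativistic spectrum above, the following sketch evaluates $2 - (x/2)\coth(x/2)$ with $x = \nu/56.8$ GHz and locates its sign change by bisection. The function names and the bracketing interval of 100 to 400 GHz are illustrative choices.

```python
import math


def theta_sz(nu_ghz):
    """Nonrelativistic thermal SZ spectrum in thermodynamic temperature,
    2 - (x/2) coth(x/2), which tends to 1 in the Rayleigh-Jeans limit."""
    half_x = 0.5 * nu_ghz / 56.8
    return 2.0 - half_x / math.tanh(half_x)


def sz_null_frequency(lo=100.0, hi=400.0, tol=1e-6):
    """Bisection for the frequency (GHz) at which theta_sz changes sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if theta_sz(lo) * theta_sz(mid) <= 0.0:
            hi = mid  # sign change lies in [lo, mid]
        else:
            lo = mid  # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)


for nu in (30.0, 100.0, 143.0, 217.0, 353.0):
    print(f"{nu:5.0f} GHz  Theta_SZ = {theta_sz(nu):+.3f}")
print(f"sign change near {sz_null_frequency():.1f} GHz")
```

The sign change falls close to the 217 GHz channel, so that channel is nearly blind to the thermal SZ effect while the channels on either side see it with opposite signs.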

2.5. Detector noise 

As first pointed out by Knox (1995), detector noise can
be conveniently treated as an additional sky signal with

$C_\ell^{P({\rm noise})} = (w^P)^{-1}\, e^{\theta^2\ell(\ell+1)}$   (8)

if the experimental beam is Gaussian with width $\theta$ in radians
(the full-width-half-maximum is given by FWHM
$= \sqrt{8\ln 2}\,\theta$). Here the sensitivity measure $1/w^P$ is defined as
the noise variance per pixel times the pixel area in steradians
for $P = T, E, B$. As shown in Appendix A of Tegmark
(1997b), equation (8) remains valid even for incomplete
sky coverage: the corresponding information loss causes
correlations between the different noise multipoles, but not
an increase in their variance. The noise variance $(\Delta T/T)^2$
per pixel of area FWHM$^2$ is given in Table 1. We assume
that this pixel noise is equal and uncorrelated for the
two measured Stokes parameters $Q$ and $U$, which means
that the same noise value applies to $E$ and $B$ ($w^E = w^B$).
We also assume that the noise is uncorrelated between
intensity and polarization, so that $1/w^X = 0$. For an
experiment like MAP where intensity/polarization is measured
by adding/subtracting pairs of linearly polarized receivers,
$w^E = w^B = w^T/2$ (one pair measures $Q$ and $T$, another
does $U$ and $T$, and all four measurements are independent
with identical variance).
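
The noise model can be made concrete for a single channel. The sketch below combines the Table 1 relation $(w^P)^{-1/2} = \Delta T \times {\rm FWHM} \times \pi/10800$ with the Gaussian-beam noise spectrum $w^{-1} e^{\theta^2\ell(\ell+1)}$ for the unpolarized Planck 143 GHz channel (FWHM = 8.0', $10^6\,\Delta T/T = 2.0$). It assumes $T_{\rm cmb} = 2.726$ K to convert $\Delta T/T$ into microkelvin. The function names are illustrative.

```python
import math

T_CMB_UK = 2.726e6          # CMB temperature in microkelvin (assumed value)
ARCMIN = math.pi / 10800.0  # one arcminute in radians


def inverse_weight(dt_over_t_1e6, fwhm_arcmin):
    """1/w in (uK rad)^2, from w^(-1/2) = Delta T x FWHM x pi/10800."""
    delta_t = dt_over_t_1e6 * 1e-6 * T_CMB_UK  # noise per FWHM pixel in uK
    return (delta_t * fwhm_arcmin * ARCMIN) ** 2


def noise_power(ell, dt_over_t_1e6, fwhm_arcmin):
    """Noise spectrum w^-1 exp(theta^2 l(l+1)), theta = FWHM / sqrt(8 ln 2)."""
    theta = fwhm_arcmin * ARCMIN / math.sqrt(8.0 * math.log(2.0))
    return inverse_weight(dt_over_t_1e6, fwhm_arcmin) * math.exp(theta ** 2 * ell * (ell + 1))


# Planck 143 GHz, unpolarized: FWHM = 8.0 arcmin, 10^6 Delta T/T = 2.0.
for ell in (10, 100, 1000, 2000):
    print(f"l = {ell:4d}  C_l(noise) = {noise_power(ell, 2.0, 8.0):.3e} uK^2")
```

The noise spectrum is flat on scales much larger than the beam and grows exponentially once $\ell$ exceeds roughly $1/\theta$, which is what limits each channel at high $\ell$.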

2.6. Point sources 

The TE96 point-source model assumed that all sources
above some flux cut $S_c$ could be removed from the map
(by discarding the contaminated pixels, say) and gave the
power spectrum due to Poisson fluctuations in the unresolved
remainder. Here we will make the conservative assumption
that no external source templates will be available
at these frequencies, so that point sources must be
detected internally from the CMB maps themselves, say as
$5\sigma$ outliers. Especially for high sensitivity experiments
such as Planck, the main sources of confusion noise are the
CMB fluctuations themselves (and dust at very high
frequencies). It is therefore desirable to spatially band-pass
filter the maps to suppress CMB and detector noise
fluctuations before performing the point-source search. Tegmark
& de Oliveira-Costa (1998) derive such a procedure and
find that the resulting minimal rms confusion noise $\sigma$ for
point-source detection (in MJy) is given by

$\sigma(\nu) = [c(\nu)c_*(\nu)]^{-1}\left[\sum_\ell \frac{2\ell+1}{4\pi\,C_\ell^{({\rm tot})}}\right]^{-1/2}$,   (9)

where $C_\ell^{({\rm tot})}$ is the sum of the power spectra of other
foregrounds, noise and CMB. Tegmark & de Oliveira-Costa
(1998) find that this filtering lowers the point-source
detection threshold $\sigma$ by a factor between 2.5 and 18 for
Planck. Refregier et al. (1998) present such an analysis
for the MAP satellite.

Once the flux cut S c = 5a has been computed using our 
foreground and CMB model (the latter is described in §[|) , 
we calculate the point-source power spectrum using the 
expression (TE96) 



= {c{v)c*{v]\ 2 
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Fig. 6. — Same as Fig. pL but for radio and far infrared point sources. 



Here $n(S,\nu)$ gives the source counts, i.e., the number of
point sources per steradian whose flux exceeds $S$ at the
frequency $\nu$. We evaluate this integral, which is independent
of $\ell$, separately for each frequency channel using the
source count model of Toffolatti et al. (1998, 1999; see
also Guiderdoni et al. 1998, Guiderdoni 1999). We then
multiply the resulting power spectrum by the normalization
fudge factors $(pA)^2$ given in Table 2. These source
count models are consistent with the upper limits from
the SCUBA experiment (Scott & White 1999; Mann et al.
1999) and other observations (Gawiser et al. 1998). As
stressed by, e.g., Franceschini et al. (1989), point-source
clustering can create additional large-scale power. However,
calculations of this effect (TE96; Toffolatti et al.
1998; Cress et al. 1996) suggest that it is diluted by angular
projection down to levels that are negligible compared
with the Poisson term of equation (10) (cf. Scott & White
1999). The same holds for the effect of weak-lensing
modulation of the flux cut (Tegmark & Villumsen 1997).

This treatment is rather conservative in that it makes no
assumptions about our ability to model the frequency
dependence of point sources. In other words, it assumes that
one can remove a source from a map only if one actually
detects it at that particular frequency. In practice, one
might opt to discard pixels as contaminated if they contain
a detected point source at other nearby frequencies as
well, further reducing the residual $\sigma_{ps}$. Since most point
sources have a spectrum substantially different from that of
the CMB, the detection threshold can also be pushed below that
of equation (9) by taking linear combinations of band-pass
filtered versions of different channels, tailored to subtract
out, say, the CMB and/or dust signals.

The frequency dependence of the residual point sources
has a distinctly bimodal distribution, corresponding to radio
sources (blazars, etc.) and far infrared sources (early
dusty galaxies, etc.). Since these are modeled separately
in Toffolatti et al. (1998), we treat them as two independent
components, greatly reducing the effective spectral
index uncertainty. We take $\Delta\alpha = 0.5$ for the radio sources
in the MID model. If measurements at different frequencies
are not taken simultaneously, time-variability of the
sources will increase this number (Gutierrez et al. 1999).
A more detailed model of the frequency coherence of IR
point sources is given in Fig. E.5 of the HFI report
(AAO 1998; reprinted as Fig. 5b in BG99), suggesting
that $\Delta\alpha$ may be smaller for this population. We therefore
assume $\Delta\alpha = 0.3$ for the IR point sources (MID).

For the polarization power spectra, we conservatively
assume that the radio sources are 10% polarized and the
IR sources are 5% polarized. Point sources are one of
the few foregrounds whose polarization is not likely to be
important. This is because the amplitude relative to noise
is always lower in polarization: detector noise is typically
"141% polarized" in the sense that it is at least as high in
the polarization maps as in the intensity maps, usually by
a factor $\sqrt{2}$.

We conclude this section with some estimates of when
point sources are important. As shown in Tegmark & Villumsen
(1997), the rms fluctuation (in μK) due to residual
point sources is



op.wyjj^jv^&w, (ii) 

where N = / n9 2 n(S c ) is the number of sources removed 
per beam area, <7 con f = accjl-nQ 2 is the confusion noise 
of equation (9) converted from Jy into μK, and the source
counts have been approximated by a power law $n'(S) \propto S^{-\gamma}$
near the flux cut. Since relevant values for $\gamma$ are typically
in the range 1.5-2.5 (see references in Tegmark &
Villumsen 1997), the first term is of order unity. The best
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Fig. 7. — This figure summarizes the frequency and scale dependence of our foreground models for the optimistic (OPT), middle-of-the-road 
(MID) and pessimistic (PESS) scenarios described in the text. The colored regions show the parts of parameter space where the foreground 
fluctuations exceed a level $\delta T_*$ characteristic of the CMB, and correspond to synchrotron (magenta), free-free (cyan) and vibrational dust
emission (red), rotational dust emission (blue) and the thermal SZ effect (yellow). For point sources, the residual is experiment-specific since
it depends on the flux cut down to which point sources can be detected and excised; it is shown separately for MAP and Planck as thick
green lines. The black boxes show where detector noise is less than $\delta T_*$ for MAP and Planck. The thresholds in $Q_{\rm flat} = (5/12)^{1/2}\,\delta T_*$ are
20 μK, 3 μK and 0.5 μK for unpolarized, cross-polarized and E-polarized fluctuations, respectively. The B-spectra are similar to those shown
for E-polarization.
Fig. 8. — The frequency dependence of our unpolarized foregrounds is shown for three angular scales ($\ell = 2$, $\ell = 200$ and $\ell = 2000$) for the
optimistic (OPT), middle-of-the-road (MID) and pessimistic (PESS) scenarios described in the text. The thin curves correspond to synchrotron 
radiation (short-dashed), free-free emission (long-dashed), spinning dust (solid), vibrational dust (dotted), point sources (dot-dashed) and SZ 
(solid grey/yellow). The thick curves show the CMB (horizontal line) and the total for all foregrounds. 
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Table 2 

Foreground Model Parameters 
OPTIMISTIC MIDDLE-OF-ROAD PESSIMISTIC 
NOTES. — Our optimistic (OPT), middle-of-the-road (MID) and pessimistic (PESS) foreground models. The frequency dependence is normalized
so that $\Theta(\nu_*) = 1$. The power spectrum normalization is given by $(pA)^2$, as specified by the equations given in the text for free-free, synchrotron and dust
emission and for the thermal SZ effect, and by equation (10) for point sources. To avoid a profusion of large numbers in the table,
we have factored the total normalization amplitude $pA$ into an overall constant $A$ and a small dimensionless correction factor $p$ that can be
interpreted as a polarization percentage (unless the polarized and unpolarized power spectra have different slopes). The label "text" indicates that
the parameterization is to be found in the text using the given equations.
attainable $\sigma_{\rm conf}$ is typically 3-5 times $\sigma_n$, the rms detector
noise per pixel (Tegmark & de Oliveira-Costa 1998). Point
sources have only a minor impact on a CMB experiment
if $\sigma_{ps} \ll \sigma_n$, because their power spectra have the same
shape as that for detector noise (apart from the noise
increase below the beam scale). Equation (11) therefore tells
us that using the CMB map itself for point-source removal
is quite adequate as long as $N \ll (4 \times 5)^{-2} = 0.0025$.
Conversely, if there are more sources per beam than this rule
of thumb indicates, then an external point-source template
will be needed to reduce the point-source contribution to
a subdominant level.
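
The rule of thumb can be evaluated numerically. The sketch below is a hedged illustration: it assumes that equation (11) has the form $\sigma_{ps} \approx ({\rm order\ unity}) \times N^{1/2} \times 5\,\sigma_{\rm conf}$, which is our reading of the surrounding text rather than a verbatim transcription, and it takes $\sigma_{\rm conf} = 4\sigma_n$ as a representative value within the quoted range of 3-5. The function names and default arguments are hypothetical.

```python
import math

def sigma_ps_over_sigma_n(n_per_beam, conf_over_noise=4.0, n_sigma_cut=5.0, prefactor=1.0):
    """Residual point-source rms in units of the detector noise rms.

    Assumed form: sigma_ps ~ prefactor * sqrt(N) * n_sigma_cut * sigma_conf,
    with sigma_conf = conf_over_noise * sigma_n and an order-unity prefactor.
    """
    return prefactor * math.sqrt(n_per_beam) * n_sigma_cut * conf_over_noise

def critical_sources_per_beam(conf_over_noise=4.0, n_sigma_cut=5.0):
    """Number of removed sources per beam at which sigma_ps would equal sigma_n."""
    return (conf_over_noise * n_sigma_cut) ** -2

n_crit = critical_sources_per_beam()            # (4 * 5)^-2
ratio_at_crit = sigma_ps_over_sigma_n(n_crit)   # equals 1 by construction
```

With these inputs $(4 \times 5)^{-2}$ evaluates to 0.0025; internal point-source removal is adequate only when $N$ is well below this value.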

2.7. Foreground model summary 

The specifications of our foreground models are given
in Table 2. The power spectrum and frequency dependence
are summarized in Figure 7, which follows TE96 in
showing where the various foregrounds dominate over a
typical CMB signal. More details about each foreground 
are given in figures 0-^, which show the power spectra at 
three characteristic frequencies. Figure 8 shows the frequency
dependence of the foregrounds on three different
angular scales.

3. FOREGROUND MODELS 2: THE MATH 
3.1. Notation 

As described in T98 and further elaborated by White 
(1998), foregrounds can be treated as simply an additional 
source of noise that is correlated between frequency chan- 
nels. This leads to a natural way of parameterizing them 
as well as to a useful way of removing them. Let us first ex- 
press this in its most general mathematical form, and then 
specialize to a case appropriate for our present application 
of accuracy forecasting. 

Consider a pixelized CMB sky map (the "true sky")
at some angular resolution $\theta_0$ consisting of $M$ numbers
$x_1, \ldots, x_M$, where $x_i$ is the temperature in the $i^{\rm th}$ pixel.
Suppose that we have single-frequency data sets at our
disposal at $F$ different frequencies $\nu_f$ ($f = 1, 2, \ldots, F$),
consisting of $N_f$ numbers $y_f^1, \ldots, y_f^{N_f}$, each probing some
linear combination of the sky temperatures $x_i$. Grouping
these numbers into vectors $\mathbf{x}$, $\mathbf{y}_1$, $\mathbf{y}_2$, ..., $\mathbf{y}_F$ of length $M$,
$N_1$, $N_2$, ..., $N_F$ (these lengths are generally all different),
we can generally write

$$\mathbf{y}_f = \mathbf{A}_f\mathbf{x} + \mathbf{n}_f \qquad (12)$$



for some known $N_f \times M$ scan strategy matrices $\mathbf{A}_f$
incorporating the beam shapes and some random vectors $\mathbf{n}_f$
incorporating instrumental noise and foreground contamination.
The special case where the $F$ data sets are simply
sky maps with resolution $\theta_0$ corresponds to $\mathbf{A}_f = \mathbf{I}$. If
the data sets are maps with different angular resolutions
$\theta_f > \theta_0$, then

$$(\mathbf{A}_f)_{ij} = \frac{1}{2\pi(\Delta\theta_f)^2}\exp\left[-\frac{\theta_{ij}^2}{2(\Delta\theta_f)^2}\right] \qquad (13)$$

if the beams are Gaussian, where $\theta_{ij} \equiv \cos^{-1}(\hat{\mathbf{r}}_i \cdot \hat{\mathbf{r}}_j)$ is
the angular separation between pixels in directions $\hat{\mathbf{r}}_i$ and
$\hat{\mathbf{r}}_j$ and $\Delta\theta_f \equiv (\theta_f^2 - \theta_0^2)^{1/2}$ is the extra smoothing in map
$f$. Equation (12) is completely general, however, since
the scan strategy matrix $\mathbf{A}_f$ can also incorporate complications
such as elliptical and non-Gaussian beams, triple
beams, interferometer beams, or oblong synthesized beams
(e.g. Saskatoon). Of course, the data sets need not be different
channels observed by the same experiment; for instance,
one might wish to use the 408 MHz Haslam survey
as an additional "channel".

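It is useful to define the larger $(\sum_f N_f) \times M$ matrix and
the $(\sum_f N_f)$-dimensional vectors

$$\mathbf{A} \equiv \begin{pmatrix}\mathbf{A}_1\\ \vdots\\ \mathbf{A}_F\end{pmatrix},\quad
\mathbf{y} \equiv \begin{pmatrix}\mathbf{y}_1\\ \vdots\\ \mathbf{y}_F\end{pmatrix},\quad
\mathbf{n} \equiv \begin{pmatrix}\mathbf{n}_1\\ \vdots\\ \mathbf{n}_F\end{pmatrix}. \qquad (14)$$

This allows us to rewrite equation (12) as

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n}, \qquad (15)$$

a set of linear equations that would be highly over-determined
if it were not for the presence of unknown noise $\mathbf{n}$.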
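
The stacking of equations (12) into the single system (15) can be sketched as follows. This is a toy illustration only: the dimensions, the random stand-ins for the scan strategy matrices and the noise amplitude are all hypothetical, chosen merely to show the bookkeeping.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 6                 # number of sky pixels (toy value)
N_f = [6, 4, 5]       # lengths N_f of the F = 3 single-frequency data sets (toy values)

x = rng.normal(size=M)                                  # the "true sky" vector
A_f = [rng.normal(size=(n, M)) for n in N_f]            # stand-ins for the scan strategy matrices
n_f = [0.1 * rng.normal(size=n) for n in N_f]           # stand-ins for noise plus foregrounds
y_f = [Af @ x + nf for Af, nf in zip(A_f, n_f)]         # equation (12), one channel at a time

# Stack everything into the (sum_f N_f)-dimensional system of equation (15).
A = np.vstack(A_f)
y = np.concatenate(y_f)
n = np.concatenate(n_f)
```

Here `A` has shape (15, 6), so there are more equations than unknown pixel temperatures, which is the over-determined situation described above.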

It is straightforward to include polarization information
in our formalism. In this case, we wish to measure not one
sky map but three: the unpolarized temperature map $\mathbf{x}^T$
and the "electric" and "magnetic" polarization maps $\mathbf{x}^E$
and $\mathbf{x}^B$ (Kamionkowski et al. 1997; Zaldarriaga & Seljak
1997). The latter are linearly related to the Stokes Q and
U maps and have the advantage of being independent of
the choice of coordinate system and more directly linked
to the physical processes that make the CMB polarized.
Grouping them into a single vector

$$\mathbf{x} \equiv \begin{pmatrix}\mathbf{x}^T\\ \mathbf{x}^E\\ \mathbf{x}^B\end{pmatrix} \qquad (16)$$

and enlarging $\mathbf{y}$, $\mathbf{n}$ and $\mathbf{A}$ to include polarized
measurements, we once again recover the form of equation (15).

3.2. Parameter estimation 

The general goal is to use the data set $\mathbf{y}$ to measure a
set of physical parameters. These parameters, which we
will denote $p_i$ ($i = 1, \ldots, N$) and group together in a vector
$\mathbf{p}$, can be either cosmological parameters, such as the true
CMB sky temperatures $\mathbf{x}$ or model inputs like the baryon
density $\Omega_b$, or constants that parameterize the foreground
model, such as the emissivity $\alpha$ of thermal dust emission or
the scale dependence $\beta$ of synchrotron radiation. How
accurately can this be done? If the likelihood of observing $\mathbf{y}$
given these parameters is written $\mathcal{L}(\mathbf{y};\mathbf{p})$, then the answer
is contained in the Fisher information matrix (Kendall &
Stuart 1969)

$$\mathbf{F}_{ij} \equiv -\left\langle\frac{\partial^2 \ln\mathcal{L}}{\partial p_i\,\partial p_j}\right\rangle, \qquad (17)$$

where the partial derivatives and the averaging are evaluated
using the true values of the parameters $\mathbf{p}$. The
Cramér-Rao inequality shows that $(\mathbf{F}^{-1})_{ii}$ is the smallest
variance that any unbiased estimator of the parameter
$p_i$ can have, and we can generally think of $\mathbf{F}^{-1}$ as the best
possible covariance matrix for estimates of the vector $\mathbf{p}$
(see Tegmark et al. 1997 for a review).
In §4, we will present a foreground removal method that
recovers the CMB map $\mathbf{x}$ with these minimal error bars
if the foreground model is known. In §5 we assess the
accuracy to which cosmological parameters and foreground
parameters can be measured jointly.

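For the important case when all fluctuations are Gaussian
with mean$^4$ $\langle\mathbf{y}\rangle = 0$, i.e., when the vector $\mathbf{y}$ has a
multivariate Gaussian probability distribution of the form

$$\mathcal{L}(\mathbf{y};\mathbf{p}) = (2\pi)^{-n/2}\,|\mathbf{C}|^{-1/2}\,e^{-\frac{1}{2}\mathbf{y}^t\mathbf{C}^{-1}\mathbf{y}}, \qquad (18)$$

the model is entirely specified by the covariance matrix
$\mathbf{C} \equiv \mathbf{C}(\mathbf{p}) = \langle\mathbf{y}\mathbf{y}^t\rangle$. The Fisher matrix then becomes

$$\mathbf{F}_{ij} = \frac{1}{2}\,{\rm tr}\left[\mathbf{C}^{-1}\frac{\partial\mathbf{C}}{\partial p_i}\mathbf{C}^{-1}\frac{\partial\mathbf{C}}{\partial p_j}\right]. \qquad (19)$$

The covariance matrix $\mathbf{C}$, with contributions from CMB,
foregrounds and detector noise, is therefore the key quantity
that our model must provide. Modeling $\mathbf{C}$ is the topic
of the next section.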
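
As a minimal numerical sketch of equation (19), the Python function below evaluates the Gaussian Fisher matrix from a covariance matrix and its parameter derivatives. The function name `gaussian_fisher` and the toy model (uncorrelated pixels with a common variance $p$) are assumptions introduced purely for illustration; they are not part of the foreground model of this paper.

```python
import numpy as np

def gaussian_fisher(C, dC_list):
    """Fisher matrix F_ij = 0.5 * tr[C^-1 dC_i C^-1 dC_j] for zero-mean Gaussian data.

    C       : (n, n) covariance matrix evaluated at the fiducial parameters
    dC_list : list of (n, n) derivatives dC/dp_i, one per parameter
    """
    # Solve C X = dC_i rather than forming C^-1 explicitly.
    Cinv_dC = [np.linalg.solve(C, dC) for dC in dC_list]
    n_par = len(dC_list)
    F = np.empty((n_par, n_par))
    for i in range(n_par):
        for j in range(n_par):
            F[i, j] = 0.5 * np.trace(Cinv_dC[i] @ Cinv_dC[j])
    return F

# Toy example (hypothetical): n_pix uncorrelated pixels with common variance p,
# so that C = p * I and dC/dp = I.
n_pix, p = 4, 2.0
F = gaussian_fisher(p * np.eye(n_pix), [np.eye(n_pix)])
cramer_rao_sigma = np.sqrt(np.linalg.inv(F)[0, 0])   # smallest attainable rms error on p
```

For this toy model the result is $F = n_{\rm pix}/(2p^2)$, so the Cramér-Rao bound reproduces the familiar variance $2p^2/n_{\rm pix}$ of a variance estimate.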

3.3. Modeling the foreground covariance matrix 

When removing foregrounds from upcoming high-precision
experiments, it may be desirable to work with $\mathbf{C}$ in its full
generality, explicitly modeling correlations between different
foregrounds, correlations between polarized and unpolarized
foregrounds, correlations between foreground fluctuation
levels and galactic latitude, etc. Indeed, the foreground
removal method that will be given below in equation (28)
requires no simplifications. However, since the
goal of the present paper is considerably more modest, we
will make several simplifying approximations below.

3.3.1. Transforming to spherical harmonics 

Let us first assume that all of the data sets are maps
and that the statistical properties of CMB, noise and
foregrounds are isotropic.$^5$ This allows us to make the matrix
$\mathbf{C}$ block-diagonal by expanding the data sets $\mathbf{y}_f$ in spherical
harmonics. For notational convenience, we renormalize
the expansion coefficients $a^{Pf}_{\ell m}$ of $\mathbf{y}_f$ (with the polarization
index $P = T$, $E$ and $B$) by dividing out the effect of the
foregrounds typically do not have an expectation value of zero 
- in fact, most of them are always positive. This is one of the rea- 
sons why it can be advantageous to expand the maps in some set of 
basis functions and remove them expansion coefficient by expansion 
coefficient instead of pixel by pixel. For instance, in a Fourier or 
spherical harmonic decomposition, it is typically only a single coeffi- 
cient (the monopole) that will have a non-zero mean. Alternatively, 
one can explicitly deal with the case of a non-zero mean including a 
constraint term (Bond et al. 1999). 

5 Galactic foregrounds such as dust, synchrotron and free-free
emission are of course not statistically isotropic, since they are more 
prevalent close to the galactic plane. To be conservative, we will 
therefore assume that only the cleanest 65% of the sky is used (for a 
straight latitude cut, this would correspond to discarding all pixels 
less than 20° from the Galactic plane), and assume that the sta- 
tistical properties of the remainder are isotropic with a foreground 
amplitude corresponding to the dirtiest remaining region. 

To take advantage of the fact that the contamination level depends 
both on angular scale and Galactic latitude, it has been suggested 
(Tegmark 1998) that the foreground removal be done not multipole 
by multipole, but wavelet by wavelet, since C will become approxi- 
mately block-diagonal in a suitable spherical wavelet basis even when 
the foreground power depends on latitude. Such wavelet bases are 
described by, e.g., Cayon et al. (1999) and Tenorio et al. (1999). 



beam, $e^{-\theta_f^2\ell(\ell+1)/2}$. Then the covariance matrix takes the
block-diagonal form

$$\mathcal{C}^{PfP'f'}_{\ell m\ell' m'} \equiv \left\langle a^{Pf\,*}_{\ell m}\,a^{P'f'}_{\ell' m'}\right\rangle = \delta_{\ell\ell'}\,\delta_{mm'}\,\mathcal{C}_\ell^{PfP'f'} \qquad (20)$$

for some size $3F \times 3F$ power spectrum matrix$^7$ $\mathcal{C}_\ell$ of the
true sky (as opposed to the beam-smoothed sky). This
of course also involves dividing the noise $\mathbf{n}$ in equation
(12) by the same factors, which allows us to recover the
foregrounds on the true sky while altering the detector 
noise to the form in equation (j^) . 

The $\mathcal{C}_\ell$ matrix can be broken into a block-matrix form

$$\mathcal{C}_\ell = \begin{pmatrix}\mathbf{C}^T_\ell & \mathbf{C}^X_\ell & 0\\ \mathbf{C}^{X\,t}_\ell & \mathbf{C}^E_\ell & 0\\ 0 & 0 & \mathbf{C}^B_\ell\end{pmatrix}, \qquad (21)$$

where the $\mathbf{C}^P_\ell$ ($P = T, E, B, X$) are $F \times F$ matrices that
specify the correlation between different frequency channels
for the intensity, E-channel polarization, B-channel
polarization, and intensity-polarization cross-correlation,
respectively. Note that for the CMB and most foregrounds,
cross-correlations between B and either T or E vanish for
symmetry reasons: B has odd parity whereas T and E have
even parity (Kamionkowski et al. 1997; Zaldarriaga & Seljak
1997). This is not necessarily true for all foregrounds,
so the T-B and B-E correlations may potentially contain
additional useful information about contamination.
For instance, the effective birefringence caused by Faraday
rotation through a uniform magnetic field is not invariant
under parity and gives such "forbidden" cross-correlations
(Lue et al. 1999).

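In terms of these power spectrum matrices, the Fisher
matrix of equation (19) reduces to

$$\mathbf{F}_{ij} = \frac{1}{2}\sum_\ell (2\ell+1)\,f_{\rm sky}\,{\rm tr}\left[\mathcal{C}_\ell^{-1}\frac{\partial\mathcal{C}_\ell}{\partial p_i}\mathcal{C}_\ell^{-1}\frac{\partial\mathcal{C}_\ell}{\partial p_j}\right], \qquad (22)$$

where the matrix multiplications involve both polarization
type and frequency. Here the factor $(2\ell+1)f_{\rm sky}$ gives
the effective number of uncorrelated modes per multipole,$^6$
and the other factor gives the information per mode.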
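
The multipole sum of equation (22) can be sketched in the same way. The code below is a hedged illustration: the array layout, the function name `fisher_from_power_spectra` and the toy spectrum (a single channel with unit power and one overall log-amplitude parameter) are assumptions made for the example; only the value $f_{\rm sky} = 0.65$ is taken from the text.

```python
import numpy as np

def fisher_from_power_spectra(ells, C, dC, f_sky):
    """Fisher matrix summed over multipoles.

    ells  : (L,) multipoles
    C     : (L, n, n) power spectrum matrices, one per multipole
    dC    : (P, L, n, n) derivatives with respect to the P parameters
    f_sky : sky fraction used
    """
    n_par = dC.shape[0]
    F = np.zeros((n_par, n_par))
    for k, ell in enumerate(ells):
        Cinv_dC = [np.linalg.solve(C[k], dC[i, k]) for i in range(n_par)]
        n_modes = (2 * ell + 1) * f_sky          # effective number of uncorrelated modes
        for i in range(n_par):
            for j in range(n_par):
                F[i, j] += 0.5 * n_modes * np.trace(Cinv_dC[i] @ Cinv_dC[j])
    return F

# Toy example (hypothetical): one channel, unit power at every multipole, and a single
# parameter equal to the logarithm of the overall amplitude, so that dC/dp = C.
ells = np.arange(2, 11)
C = np.ones((ells.size, 1, 1))
dC = C[np.newaxis].copy()
F_amp = fisher_from_power_spectra(ells, C, dC, f_sky=0.65)
```

In this toy case each mode contributes 1/2 to the Fisher information, so `F_amp` is simply half the effective number of modes, $\frac{1}{2} f_{\rm sky} \sum_\ell (2\ell+1)$.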

3.3.2. Separation into physical components 

We write $\mathbf{n}$ as a sum of detector noise and $K$ physically
distinct foreground components (synchrotron emission,
point sources, etc.) and assume that these are all
uncorrelated, both with each other and with $\mathbf{x}$, the CMB.
This means that the power spectrum matrix is given by a
sum

$$\mathcal{C}_\ell = \sum_{k=0}^{K+1}\mathcal{C}_{\ell(k)}, \qquad (23)$$

6 When the sky coverage $f_{\rm sky} < 1$, certain multipoles become
correlated (Tegmark 1996). This reduces the effective number of
uncorrelated modes by a factor $f_{\rm sky}^{-1}$, thereby increasing the sample
variance on power measurements by the same factor (Scott et al.
1994; Knox 1995). It also smears out sharp features in the power
spectrum, but this effect is negligible as long as the sky map is more
than a few degrees wide in its narrowest direction (Tegmark 1997b).

7 We use script letters to indicate matrices of size 3F and bold 
letters to indicate matrices and vectors of other sizes, in particular 
size F. 
where $\mathcal{C}_{\ell(k)}$ is the power spectrum matrix of the $k^{\rm th}$
component, i.e., the covariance matrix of its $a^{Pf}_{\ell m}$ at different
frequencies. $\mathcal{C}_{\ell(0)}$ denotes the CMB contribution, $\mathcal{C}_{\ell(1)}$ the
detector noise.

3.3.3. Frequency coherence 

It is convenient to factor these matrices into a spatial
term, a frequency dependence term, and a frequency
correlation term. We therefore write

$$C^{Pff'}_{\ell(k)} = C^P_{\ell(k)}\,\Theta^{Pf}_{(k)}\,\Theta^{Pf'}_{(k)}\,R^{Pff'}_{(k)}. \qquad (24)$$

We normalize the frequency spectrum $\Theta^{Pf}_{(k)} \equiv \Theta^P_{(k)}(\nu_f)$ so
that $\Theta^P_{(k)}(\nu_*) = 1$, thereby absorbing the physical units
into $C^P_{\ell(k)}$, the angular power spectrum of the $k^{\rm th}$ component
at the reference frequency $\nu_*$. The frequencies $\nu_*$
are given in Table 2 and are chosen to be where the
constraints are strongest or most relevant. The correlation
between different frequency channels is then encoded in
the matrix $\mathbf{R}$.

We will assume that the mean frequency dependence
$\Theta^{Pf}_{(k)}$ and the frequency correlations $R^{Pff'}_{(k)}$ are independent
of $\ell$ for all foregrounds. We take this frequency-scale
separability as the operational definition of a distinct component;
however, it does not necessarily hold for the physical
components of §2. One could imagine decomposing these
emission mechanisms into multiple components to take
account of changes in frequency dependence as a function of
scale.

Let us label the detector noise as $k = 1$. Then $\Theta^{Pf}_{(1)}$
is simply the rms detector noise level in channel $f$
for polarization type $P$. If this noise is uncorrelated
between channels, we have $\mathbf{R}^P_{(1)} = \mathbf{I}$, the identity matrix.
On the other hand, if the $k^{\rm th}$ foreground component has
the same spectrum $f(\nu)$ everywhere in the sky, it will
have $a^{Pf}_{\ell m} \propto f(\nu_f)$ and hence $\Theta^{Pf}_{(k)} \propto f(\nu_f)$ and $\mathbf{R}^P_{(k)} = \mathbf{E}$,
where $E_{ff'} = 1$ is the rank 1 matrix containing only ones.
Note that the CMB fluctuations fall into this category,
i.e., $\mathbf{R}^P_{(0)} = \mathbf{E}$, since their temperature is the same in all
channels. Real-world foregrounds will typically have
correlation matrices $\mathbf{R}_{(k)}$ that are intermediate between these
two extreme cases of perfect correlation ($\mathbf{R} = \mathbf{E}$) and no
correlation ($\mathbf{R} = \mathbf{I}$).

Since we presently lack detailed measurements of the
foreground correlation matrices $\mathbf{R}$, we will use the simple
one-parameter model

$$R^{ff'}_{(k)} \approx \exp\left\{-\frac{1}{2}\left[\frac{\ln(\nu_f/\nu_{f'})}{\xi_{(k)}}\right]^2\right\} \qquad (25)$$

that was derived in T98. We also explore some alternative
models later in the paper. The model parameter $\xi$, the frequency
coherence, determines by how many powers of $e$ we can
change the frequency before the correlation between the
channels starts to break down. The two limits $\xi \to 0$ and
$\xi \to \infty$ correspond to the two extreme cases $\mathbf{R} = \mathbf{I}$ and
$\mathbf{R} = \mathbf{E}$ that we encountered above. The T98 derivation of
equation (25) shows that for a spectrum of the type



one has as a rule of thumb that

$$\xi \approx \frac{1}{\sqrt{2}\,\Delta\alpha}. \qquad (27)$$

Here $\Delta\alpha$ is the rms dispersion across the sky of the spectral
index $\alpha$, and $f$ is some arbitrary function.
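
To make the coherence model concrete, the sketch below builds the correlation matrix for a set of channels, assuming the exponential form $R_{ff'} = \exp\{-\frac{1}{2}[\ln(\nu_f/\nu_{f'})/\xi]^2\}$ of T98 together with the rule of thumb $\xi \approx 1/(\sqrt{2}\Delta\alpha)$. The function names and the list of channel frequencies are illustrative assumptions; the value $\Delta\alpha = 0.5$ is the MID choice quoted earlier for radio point sources.

```python
import numpy as np

def xi_from_delta_alpha(delta_alpha):
    """Rule of thumb: frequency coherence xi ~ 1 / (sqrt(2) * delta_alpha)."""
    return 1.0 / (np.sqrt(2.0) * delta_alpha)

def coherence_matrix(freqs, xi):
    """R_ff' = exp(-0.5 * [ln(nu_f / nu_f') / xi]^2) for all channel pairs."""
    log_nu = np.log(np.asarray(freqs, dtype=float))
    dlog = log_nu[:, None] - log_nu[None, :]     # ln(nu_f / nu_f')
    return np.exp(-0.5 * (dlog / xi) ** 2)

freqs_ghz = [22.0, 30.0, 40.0, 60.0, 90.0]       # illustrative channel frequencies
xi_radio = xi_from_delta_alpha(0.5)              # Delta alpha = 0.5 (radio sources, MID)
R_radio = coherence_matrix(freqs_ghz, xi_radio)

R_incoherent = coherence_matrix(freqs_ghz, 1e-3)   # xi -> 0: approaches the identity I
R_coherent = coherence_matrix(freqs_ghz, 1e6)      # xi -> infinity: approaches the all-ones E
```

The two limiting matrices recover the cases of no correlation and perfect correlation discussed above, while `R_radio` is intermediate between them.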

The factorization into $\Theta$ and $\mathbf{R}$ in equation (24) is
appropriate for the T, E, and B block-elements, because
these elements, like the $\mathcal{C}$ matrix itself, must be symmetric.
The block-elements X are off-diagonal and therefore
need not be symmetric. Asymmetries indicate that the
correlation of the intensity at frequency $\nu_f$ and the
E-polarization at frequency $\nu_{f'}$ differs from that of the
intensity at $\nu_{f'}$ and polarization at $\nu_f$. We have no data to
inform any specification of such asymmetries; therefore, we
adopt the same symmetric form for the X elements as for
the diagonal elements. Later in the paper, we do consider
asymmetric parameterizations of these off-diagonal elements.

In conclusion, our foreground model involves specifying
the three quantities given in equation (24) for each physical
component $k$ and each of the four types of polarization power
($P = T$, $E$, $B$ and $X$): its average frequency dependence
$\Theta^P_{(k)}(\nu)$, its power spectrum $C^P_{\ell(k)}$, and its frequency
coherence $\xi^P_{(k)}$.

4. HOW ACCURATELY CAN FOREGROUNDS WITH 
KNOWN STATISTICAL PROPERTIES BE REMOVED? 

In this section, we use our foreground models to com- 
pute the level to which foregrounds can be removed. This 
is important for identifying which foregrounds are most 
damaging and therefore most in need of further study. It 
is also useful for optimizing future missions and for assess- 
ing the science impact of design changes to, e.g., Planck. 

The treatment in this section assumes that the statistical
properties of the foregrounds (power spectrum, frequency
dependence and frequency coherence) are known.
In practice, these too must of course be measured using
the data at hand, and we will treat this issue in §5.

4.1. Foreground removal 

Foreground removal involves inverting the (usually
over-determined) system of noisy linear equations (15). Which
unbiased estimate $\tilde{\mathbf{x}}$ of the CMB map $\mathbf{x}$ has the smallest
rms errors from foregrounds and detector noise combined?
Physically different but mathematically identical
problems were solved in a CMB context by Wright (1996)
and Tegmark (1997a), showing that if $\langle\mathbf{n}\rangle = 0$ (see
footnote 4), then the minimum-variance choice is

$$\tilde{\mathbf{x}} = [\mathbf{A}^t\mathbf{N}^{-1}\mathbf{A}]^{-1}\mathbf{A}^t\mathbf{N}^{-1}\mathbf{y}, \qquad (28)$$

(26) 



where $\mathbf{N} \equiv \langle\mathbf{n}\mathbf{n}^t\rangle$. Tegmark (1997a) also showed that this
retains all the cosmological information of the original data
sets if the random vector $\mathbf{n}$ has a Gaussian probability
distribution, regardless of whether the CMB signal $\mathbf{x}$ is
Gaussian or not.$^8$

8 In other words, this foreground removal method is
information-theoretically "best" (lossless) only if the foregrounds have a
multivariate Gaussian probability distribution. Generally they do not, in
which case the advantage of this scheme is merely that it is the linear
method that minimizes the total rms of foregrounds and noise.
Simulations by Bouchet et al. (1995) have shown that linear removal
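Substituting equation (28) into equation (15) shows that
the recovered map is unbiased ($\langle\tilde{\mathbf{x}}\rangle = \mathbf{x}$) and that the pixel
noise $\boldsymbol{\varepsilon} \equiv \tilde{\mathbf{x}} - \mathbf{x}$ has the covariance matrix

$$\boldsymbol{\Sigma} \equiv \langle\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^t\rangle = [\mathbf{A}^t\mathbf{N}^{-1}\mathbf{A}]^{-1}. \qquad (29)$$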



As long as $\langle\mathbf{n}\rangle = 0$, the map $\tilde{\mathbf{x}}$ remains unbiased even if
the model for the noise covariance $\mathbf{N}$ is incorrect. As
described in T98, this method generalizes and supersedes the
multi-frequency Wiener filtering technique for foreground
subtraction of TE96 and Bouchet et al. (1996),$^9$ and
reduces to the special case of Dodelson (1996) for $\mathbf{N} = \mathbf{I}$.
Note that whereas the full covariance matrix $\mathbf{C}$ was needed
to compute the general Fisher matrix in §3.3, only the
covariance $\mathbf{N}$ of the noise and foreground components is
needed here. This is because we do not care about sample
variance when the parameters to be estimated are the
CMB sky temperatures ($\mathbf{p} = \mathbf{x}$). In short, the foreground
removal method described here requires no assumptions
whatsoever about the CMB sky: we are not even assuming
that the CMB fluctuations are isotropic or Gaussian.
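
A minimal numerical sketch of the estimator $\tilde{\mathbf{x}} = [\mathbf{A}^t\mathbf{N}^{-1}\mathbf{A}]^{-1}\mathbf{A}^t\mathbf{N}^{-1}\mathbf{y}$ and its covariance $[\mathbf{A}^t\mathbf{N}^{-1}\mathbf{A}]^{-1}$ is given below. The function name `clean_map` and all numbers in the toy setup (three channels observing a single pixel, with one perfectly coherent foreground) are hypothetical and serve only to illustrate the linear algebra.

```python
import numpy as np

def clean_map(A, N, y):
    """Return (x_hat, Sigma, W_t) for x_hat = [A^t N^-1 A]^-1 A^t N^-1 y."""
    Ninv_A = np.linalg.solve(N, A)            # N^-1 A
    Sigma = np.linalg.inv(A.T @ Ninv_A)       # covariance of the recovered map
    W_t = Sigma @ Ninv_A.T                    # [A^t N^-1 A]^-1 A^t N^-1 (N is symmetric)
    return W_t @ y, Sigma, W_t

# Toy setup (hypothetical numbers): F = 3 channels observing M = 1 sky pixel.
A = np.ones((3, 1))
spectrum = np.array([1.0, 0.5, 0.25])                        # foreground frequency dependence
N = np.diag([1.0, 2.0, 4.0]) + 3.0 * np.outer(spectrum, spectrum)

x_true = np.array([2.5])
x_hat, Sigma, W_t = clean_map(A, N, A @ x_true)              # noise-free data as a check

naive = np.full(3, 1.0 / 3.0)                 # straight average of the channels
naive_variance = naive @ N @ naive            # its noise-plus-foreground variance
```

Since `W_t @ A` is the identity, noise-free data return the input sky exactly, which is the unbiasedness property; `Sigma` is never larger than `naive_variance`, as expected for a minimum-variance combination.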

4.2. How the different frequencies get weighted 

Expanding our data in spherical harmonics as above, we
subtract the foregrounds separately for each multipole $a_{\ell m}$
using equation (28). The relevant vectors and matrices
reduce to
schemes are quite effective even when faced with non-Gaussian
foreground templates. However, non-linear techniques taking
advantage of the specific form of foreground non-Gaussianity can under
some circumstances perform even better: e.g., the maximum entropy
method (Hobson et al. 1998), the filtered threshold clipping for point
sources as in §2.6 (Tegmark & de Oliveira-Costa 1998; Refregier et al.
1998), or other techniques (Ferreira & Magueijo 1997; Jewell 1999).
An additional advantage of linear methods is that their simplicity
allows the properties of the cleaned map to be computed exactly,
which facilitates its interpretation and use for measuring cosmological
parameters. For the linear method we describe, the cleaned map
is simply the sum of the true map and various residual contaminants
whose power spectra can be computed analytically.

9 TE96 proposed modeling spectral uncertainties in a given fore- 
ground by treating it as more than one component. For example, 
dust emission could be modeled as 



(30) 



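for a set of small emissivity variations $\epsilon_i$. It is easy to show that the
T98 method is recovered in the limit $c \to \infty$. The simplest case with
$c = 2$ and $\epsilon_1 = -\epsilon_2 = \epsilon$ gives the special case explored in the Planck
HFI proposal (AAO 1998; BG99) and also tested for MAP (Spergel
1998, private communication):

$$I_\nu = a_1 B(\nu)\left(\frac{\nu}{\nu_*}\right)^{\alpha+\epsilon} + a_2 B(\nu)\left(\frac{\nu}{\nu_*}\right)^{\alpha-\epsilon} \qquad (31)$$

$$\approx b_1 B(\nu)\left(\frac{\nu}{\nu_*}\right)^{\alpha} + b_2 B(\nu)\left(\frac{\nu}{\nu_*}\right)^{\alpha}\ln\frac{\nu}{\nu_*} \qquad (32)$$

if $|\epsilon\ln(\nu/\nu_*)| \ll 1$, where $b_1 \equiv a_1 + a_2$ and $b_2 \equiv (a_1 - a_2)\epsilon$. In our
formalism, this TE96 two-component model simply corresponds to
the approximation that the matrix $\mathbf{R}$ has rank 2.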



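The $F$-dimensional vectors $\mathbf{a}^T_{\ell m}$, $\mathbf{a}^E_{\ell m}$ and $\mathbf{a}^B_{\ell m}$ give the
measured multipoles at the $F$ different frequencies, i.e., the
data we wish to use to estimate the CMB multipoles in
$\mathbf{x}$. $\mathbf{e}$ is the $F$-dimensional row vector consisting entirely
of ones, $\mathcal{A}$ is the $3F \times 3$ scan strategy matrix for a given
$(\ell, m)$,$^{10}$ and $\mathbf{N}^T_\ell$, $\mathbf{N}^E_\ell$, $\mathbf{N}^B_\ell$ and $\mathbf{N}^X_\ell$ are the $F \times F$ power
spectrum matrices of the non-cosmic signal built by summing
the covariance matrices of equation (24), e.g.,

$$\mathbf{N}^T_\ell = \sum_{k=1}^{K+1}\mathbf{C}^T_{\ell(k)}. \qquad (35)$$

Equation (28) thus gives the solution $\tilde{\mathbf{x}}_{\ell m} = \mathcal{W}_\ell^t\,\mathbf{y}_{\ell m}$, where
we can write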



We = Af ( r 1 A[A t Af ( r 1 A]~ 1 — ( wf wf | (36) 

wf 



for some F-dimensional weight vectors w, so 
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(37) 
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(39) 



Since these are by construction unbiased estimators of the 
true multipoles, the weight vectors clearly satisfy e • wf — 
e • wf = e • wf = 1 (the estimates are weighted averages 



of the different measurements) and e • w 



i" 







(there is no mixing of polarizations). If the foregrounds 
and the detector noise lack correlations between T and E, 
i.e., if Nf = 0, then Af becomes block-diagonal and the 
solution simplifies to wj = wf = 0. 

These weight vectors are plotted for MAP in Figures 9
and 10, and some Planck examples will be shown in §4.6.1.
We have simplified these figures by using the approximation
of ignoring foreground correlations between the T and
E maps. In other words, we plot the best choice of weighting
for which the cross-polarization weight vectors vanish.
It is generally possible to do slightly better.
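
When the cross-polarization weights are set to zero, each polarization type is cleaned separately and the estimator of equation (28) reduces, for $\mathbf{A}$ equal to a column of ones, to the weights $\mathbf{w} = \mathbf{N}^{-1}\mathbf{e}/(\mathbf{e}^t\mathbf{N}^{-1}\mathbf{e})$. The sketch below illustrates this under assumed toy inputs: three channels with equal detector noise and one strong, perfectly coherent foreground. The function name `channel_weights` and all numbers are hypothetical.

```python
import numpy as np

def channel_weights(N):
    """Minimum-variance weights w = N^-1 e / (e^t N^-1 e) for one polarization type."""
    e = np.ones(N.shape[0])
    Ninv_e = np.linalg.solve(N, e)
    return Ninv_e / (e @ Ninv_e)

noise_var = np.array([1.0, 1.0, 1.0])     # hypothetical detector noise variances
theta = np.array([2.0, 1.0, 0.5])         # hypothetical foreground frequency dependence
N_det = np.diag(noise_var)

w_free = channel_weights(N_det)                                   # no foregrounds
w_fg = channel_weights(N_det + 100.0 * np.outer(theta, theta))    # strong coherent foreground

noise_free = w_free @ N_det @ w_free    # residual detector noise, inverse-variance weighting
noise_fg = w_fg @ N_det @ w_fg          # residual detector noise after foreground cleaning
```

In both cases the weights sum to unity. With the foreground present, one weight turns negative so that the foreground nearly cancels (`w_fg @ theta` is close to zero), and the residual detector noise `noise_fg` exceeds `noise_free`.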

A number of features of these figures are easy to interpret.
The foreground-free case of Figure 9 corresponds to
a standard minimum-variance weighting of the channels.
Although the MAP specifications are such that all five
channels are equally sensitive on large scales, the higher
channels get more weight on small scales because of their
superior angular resolution. Although Figure 10 shows
that things get more complicated in the presence of
foregrounds, we recover this familiar inverse-variance
weighting in the limit where foregrounds are less of a headache
than detector noise, here for $\ell > 300$. On angular scales
where foregrounds constitute a major problem, the weighting
scheme works harder to subtract them out: the weights
must still add up to unity, but now some of them go negative
and others become as large as 3. For instance, large
positive weight is given to the 60 GHz channel on large
scales, balanced against a negative weight at 90 GHz (to
subtract out vibrating dust) and 40 and 22 GHz (to remove
10 $\mathcal{A}$ takes on this trivial form due to the renormalization of $\mathbf{y}_{\ell m}$
and $\mathbf{x}_{\ell m}$ to the true sky in §3.3.1, i.e., since beam effects have been
eliminated.
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gives a 3 x 3 covariance matrix of the form 



Fig. 9. — The weights $\mathbf{w}^T_\ell$ with which the 5 unpolarized MAP
channels are combined into a single map are plotted as a function of
angular scale $\ell$ for the case of no foregrounds. Similar plots for the
Wiener filtering method can be found in AAO (1998) and BG99.
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(40) 

where $\tilde{N}^T_\ell$, $\tilde{N}^E_\ell$ and $\tilde{N}^B_\ell$ are the cleaned power spectra of
the non-cosmic signals in the T, E and B maps, and $\tilde{N}^X_\ell$ is
the cross-correlation between T and E. These four power
spectra are plotted in the rightmost panels of figures [l] |g| 
for the cleaned Boomerang, MAP and Planck maps. 

Note that although the CMB power spectrum emerges 
unscathed from the map merging process (since the weights 
were always normalized to add up to unity), the input 
power spectra of the various foregrounds generally get 
their shape distorted (Nf ^ Nf). This is because the 
weighting is different for each $\ell$-value, typically suppressing
foregrounds by a greater factor on those angular scales
where they are large and damaging than on scales where 
they are fairly negligible. Indeed, the rightmost 3 panels 
of Figures nj-o show that rather complex power spectrum 
features can become imprinted on the least important fore- 
grounds, as the need to subtract out more important fore- 
grounds shifts the relative channel weights around. 




Fig. 10. — Same as Figure 9 but for the MID foreground scenario.



synchrotron, free-free and spinning dust emission). The
greater the assumed amplitudes are for the foregrounds,
the more aggressively the cleaning method tries to subtract
them out with large positive and negative weights. The
price for this is of course that the residual detector noise
becomes larger than for the minimum-variance weighting
of Figure 9.

4.3. The three cleaned maps and their power spectra 

Transforming the cleaned multipoles $a_{\ell m}$ back into real
space, the final result of this foreground subtraction procedure
is three cleaned maps of the CMB: one intensity map
T and two polarization maps E and B. Equation (29)



4.4. Power spectrum error bars 

How accurately can we measure the four CMB power
spectra from these three cleaned maps? If we parameterize
our cosmological model directly in terms of the CMB
power spectrum coefficients, i.e.,

$$\mathbf{p}_\ell \equiv (C_\ell^T, C_\ell^E, C_\ell^B, C_\ell^X), \qquad (41)$$



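we can answer this question by computing the corresponding
$4\times4$ Fisher matrix $\mathbf{F}_\ell$. Our measurement $\tilde{\mathbf{x}}_{\ell m}$ of the
3-dimensional multipole vector $\mathbf{x}_{\ell m}$ from equation ( p3| ) has
a covariance matrix

$$\mathbf{C}_\ell \equiv \begin{pmatrix} C_\ell^T & C_\ell^X & 0 \\ C_\ell^X & C_\ell^E & 0 \\ 0 & 0 & C_\ell^B \end{pmatrix}. \qquad (42)$$

Here $C_\ell^T$, $C_\ell^E$, $C_\ell^B$ and $C_\ell^X$ are the total power spectra in
the cleaned maps, combining the contributions from CMB,
detector noise and foregrounds, e.g., $C_\ell^T = C_\ell^{T({\rm CMB})} + N_\ell^T$.
Since $\tilde{\mathbf{x}}_{\ell m}$ is by assumption Gaussian-distributed, our
sought-for $4\times4$ Fisher matrix $\mathbf{F}_\ell$ is given by

$$(\mathbf{F}_\ell)_{ij} = \frac{1}{2}\,{\rm tr}\left[\mathbf{C}_\ell^{-1}\frac{\partial\mathbf{C}_\ell}{\partial p_i}\mathbf{C}_\ell^{-1}\frac{\partial\mathbf{C}_\ell}{\partial p_j}\right], \qquad (43)$$

which after some algebra reduces to

$$\mathbf{F}_\ell = \frac{1}{2D_\ell^2}\begin{pmatrix} E_\ell^2 & X_\ell^2 & 0 & -2E_\ell X_\ell \\ X_\ell^2 & T_\ell^2 & 0 & -2T_\ell X_\ell \\ 0 & 0 & D_\ell^2/B_\ell^2 & 0 \\ -2E_\ell X_\ell & -2T_\ell X_\ell & 0 & 2(T_\ell E_\ell + X_\ell^2) \end{pmatrix}, \qquad (44)$$

where $D_\ell \equiv T_\ell E_\ell - X_\ell^2$ and the rows and columns are ordered as
(T, E, B, X). We have used the shorthand convention
$P_\ell \equiv C_\ell^P$ here (and only here) for space reasons.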
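This is the information content in a single multipole $\tilde{\mathbf{x}}_{\ell m}$.
Since we effectively have $(2\ell+1)f_{\rm sky}$ independent
modes that each measure the four power spectrum coefficients
in $\mathbf{p}_\ell$, the full $\mathbf{F}_\ell$ is $(2\ell+1)f_{\rm sky}$ times that given by
equation (44). Inverting this matrix gives the best attainable
covariance matrix $\mathbf{M}_\ell$ for our 4-vector $\mathbf{p}_\ell$ of measured
power spectra:

$$\mathbf{M}_\ell = f_{\rm sky}^{-1}(2\ell+1)^{-1}\mathbf{F}_\ell^{-1},$$

where

$$\mathbf{F}_\ell^{-1} = 2\begin{pmatrix} (C_\ell^T)^2 & (C_\ell^X)^2 & 0 & C_\ell^T C_\ell^X \\ (C_\ell^X)^2 & (C_\ell^E)^2 & 0 & C_\ell^E C_\ell^X \\ 0 & 0 & (C_\ell^B)^2 & 0 \\ C_\ell^T C_\ell^X & C_\ell^E C_\ell^X & 0 & [(C_\ell^X)^2 + C_\ell^T C_\ell^E]/2 \end{pmatrix}. \qquad (46)$$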



Zaldarriaga et al. (1997) showed that this same covariance
matrix is obtained when measuring the power spectrum in
the maps in the usual way, with estimators
$(2\ell+1)^{-1}\sum_{m=-\ell}^{\ell}|a_{\ell m}|^2$, which demonstrates that
this method retains all the power spectrum information
available.

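The analogous derivation of the Fisher matrix for other
parameters upon which the CMB power spectrum depends,
say a vector $\mathbf{p}'$ of cosmological parameters
($h$, $\Omega_b$, etc.), shows that it can be expressed in terms
of this matrix:

$$\mathbf{F}' = \sum_\ell \left(\frac{\partial\mathbf{p}_\ell}{\partial\mathbf{p}'}\right)^t \mathbf{M}_\ell^{-1}\,\frac{\partial\mathbf{p}_\ell}{\partial\mathbf{p}'}. \qquad (47)$$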



The variance of a measured power spectrum coefficient
$C_\ell^P$ is given by the corresponding diagonal element of
Equation (^) , so the error bars take the particularly simple
form $\Delta C_\ell^P = [(2\ell+1)f_{\rm sky}/2]^{-1/2}\,C_\ell^P$. Let us define the
degradation factor DF as the factor by which these error
bars increase in the presence of foregrounds. For the T, E
and B cases, we have $\Delta C_\ell^P \propto C_\ell^P$, so this factor becomes
simply
(48)

where $C_\ell^{P({\rm foreg})}$ is the sum of the powers $C_{\ell(k)}^P$ of all
foreground components. The expression becomes more complicated
for the X case, where, with the same prefactor,
$(\Delta C_\ell^X)^2 \propto [(C_\ell^X)^2 + C_\ell^T C_\ell^E]/2$.
These degradation factors are plotted in Figure 11
for the T and E cases, for Boomerang, MAP and
Planck and our PESS, MID and OPT scenarios. Here
and throughout, we use the $\Lambda$CDM cosmology presented
in §y.
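As a rough illustration of how these error bars and the degradation factor can be evaluated, the following Python sketch implements the expressions quoted above. The function names and the sample inputs are illustrative assumptions rather than part of the original analysis, and `degradation_factor` simply assumes that the error bars scale with the total power in the cleaned map.

```python
import math


def power_spectrum_error(c_total, ell, f_sky):
    """1-sigma error on an auto spectrum (T, E or B):
    Delta C = [(2*ell + 1) * f_sky / 2]**(-1/2) * C."""
    return math.sqrt(2.0 / ((2 * ell + 1) * f_sky)) * c_total


def cross_spectrum_error(c_t, c_e, c_x, ell, f_sky):
    """1-sigma error on the cross spectrum X:
    (Delta C_X)**2 = (C_X**2 + C_T * C_E) / ((2*ell + 1) * f_sky)."""
    return math.sqrt((c_x ** 2 + c_t * c_e) / ((2 * ell + 1) * f_sky))


def degradation_factor(c_cmb, n_without_fg, n_with_fg):
    """Ratio of the T, E or B error bars with and without foregrounds.

    n_without_fg: non-cosmic power in the cleaned map with no foregrounds
                  (detector noise only).
    n_with_fg:    non-cosmic power in the cleaned map when foregrounds are
                  present (noise plus residual foregrounds).
    Splitting the inputs this way is an assumption made for this sketch.
    """
    return (c_cmb + n_with_fg) / (c_cmb + n_without_fg)
```

With equal T, E and X spectra the cross-spectrum error reduces to the auto-spectrum error, which is a quick sanity check on the two formulas.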

For the T case, we see that foregrounds never increase
error bars by more than 10% in the MID scenario or by more
than a factor of 2 in the PESS scenario. For E, the MID
foregrounds never cost more than a factor of two, whereas
Fig. 11. — Degradation factors. The fraction DF - 1 by which
foregrounds increase the power spectrum error bars is shown for 
Boomerang (top), MAP (middle) and Planck (bottom). Each 
shaded band shows the range of uncertainty between the PESS and 
OPT models, with the MID case indicated by a heavy curve. The 
lighter band is for intensity T, the darker one for E polarization. 



the PESS case degrades Planck (which has the most to 
lose because of its high sensitivity) about twenty-fold at 
$\ell \sim 10$. Since noise is negligible at low $\ell$, the foregrounds
are competing only with sample variance here, and so the
E degradation is caused by the polarized foreground power
being substantial compared to the CMB power. Since detector
noise always dominates at high $\ell$, the degradation
asymptotically goes away as $\ell \to \infty$. For the unpolarized
case, the degradation is seen to be worst between these two 
limits, around the beam scale of each experiment, where 
point-source power can become comparable to both CMB 
and noise. 

Two other foreground degradation measures have been 
previously used in the literature. The most closely re- 
lated one is the foreground degradation factor of Dodelson 
(1996), which is the ratio of the rms noise in the cleaned 
maps for the cases with and without foregrounds. This 
assumed that foregrounds could be subtracted out completely,
i.e., that $\Delta\alpha = 0$ and that there were more
channels than foreground components. The "quality factor"
(Bouchet et al. 1999; BG99) is the amount by which
multifrequency Wiener filtering suppresses the power of the
CMB in the cleaned map, assuming $\Delta\alpha = 0$, and was
defined for each foreground component separately. The 
most important difference is that our degradation factor 
is relative to the noise plus sample variance, since our fo- 
cus is on power spectra and measurement of cosmological 
parameters. 

4.5. Dependence on assumptions about amplitude 

The MID model in Figure 11 gives our estimate of how
small the noise and foreground power spectra can be made 
in the cleaned maps, while comparing the OPT and PESS 
models indicates the range of uncertainty. Let us now 
discuss the effects of model assumptions in more detail. 

The foreground behavior enters in two different ways: 

1. The assumed foreground behavior determines the 
weights w that we use when cleaning the maps. 

2. The true foreground behavior determines the actual 
foreground residual in the cleaned maps. 

Let us make this distinction explicit by using $\mathcal{N}^a$ to
denote our assumed foreground matrix (our prior), as
distinguished from the true matrix $\mathcal{N}$. The resulting
foreground contamination is then given by (suppressing the $\ell$
subscript)

$$\boldsymbol{\Sigma} = [\mathbf{A}^t(\mathcal{N}^a)^{-1}\mathbf{A}]^{-1}\,[\mathbf{A}^t(\mathcal{N}^a)^{-1}\mathcal{N}(\mathcal{N}^a)^{-1}\mathbf{A}]\,[\mathbf{A}^t(\mathcal{N}^a)^{-1}\mathbf{A}]^{-1}, \qquad (49)$$

which only reduces to equation (40) if $\mathcal{N}^a = \mathcal{N}$, i.e., if
our model is correct. We will still recover unbiased CMB
maps T, E and B even if our model is incorrect, but generally
with larger foreground contamination than would be
attainable with a correct model.
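The two roles of the foreground model can be made concrete with a small numerical sketch. It treats the simplest case of a single cleaned map built from three channels, so that $\mathbf{A}$ reduces to a column of ones and the weights are the unit-sum minimum-variance weights. The noise levels, foreground amplitudes and coherence matrices below are made-up values, and all function names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def cleaning_weights(n_assumed):
    """Weights w = N^-1 e / (e^t N^-1 e) built from the ASSUMED covariance."""
    y = solve(n_assumed, [1.0] * len(n_assumed))
    total = sum(y)
    return [v / total for v in y]


def residual_power(weights, n_true):
    """Non-cosmic power left in the cleaned map, w^t N_true w."""
    return sum(wi * n_true[i][j] * wj
               for i, wi in enumerate(weights)
               for j, wj in enumerate(weights))


def covariance(noise_var, fg_amp, coherence):
    """Channel covariance: diagonal noise plus one foreground component."""
    n = len(noise_var)
    return [[(noise_var[i] if i == j else 0.0)
             + fg_amp[i] * fg_amp[j] * coherence[i][j]
             for j in range(n)] for i in range(n)]


# Hypothetical three-channel setup (arbitrary units).
noise_var = [1.0, 1.0, 1.0]
fg_amp = [3.0, 1.0, 0.3]
r_true = [[1.0, 0.8, 0.5], [0.8, 1.0, 0.8], [0.5, 0.8, 1.0]]
r_coherent = [[1.0] * 3 for _ in range(3)]                       # "foolhardy"
r_incoherent = [[1.0 if i == j else 0.0 for j in range(3)]       # "cautious"
                for i in range(3)]

n_true = covariance(noise_var, fg_amp, r_true)
residuals = {
    label: residual_power(cleaning_weights(covariance(noise_var, fg_amp, r)), n_true)
    for label, r in [("correct", r_true),
                     ("cautious", r_incoherent),
                     ("foolhardy", r_coherent)]
}
```

Because the weights built from the true covariance minimize the residual subject to summing to unity, the `"correct"` entry can never exceed the other two; which of the two incorrect assumptions does worse depends on the numbers chosen.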



Equation (49) shows that the contaminant power spectra
in $\boldsymbol{\Sigma}$ depend linearly on $\mathcal{N}$. Thus the complicated
residual foreground power spectra depicted in the 3 rightmost
panels of Figures (computed with equation ([l9|) by
taking $\mathcal{N}$ to be the contribution from a single foreground
component) may be thought of as the result of multiplying
the rather featureless foreground power spectra that
are actually on the sky by known transfer functions.

4.6. Dependence on assumptions about frequency 
coherence 

4.6.1. Effect of changing $\Delta\alpha$

In general, the less coherent a foreground is, the more
difficult it is to remove. Figure 12 shows this effect. All
panels use the scale and frequency dependence of the MID
model, but with the frequency coherence spanning the
range between the extreme cases $\xi = \infty$ and $\xi = 0$. As
expected, the situation generally gets worse as we progress
from ideal perfectly coherent foregrounds (bottom curve)
to realistic (middle three curves) and completely incoherent
ones (top curve). The incoherent case corresponds to
no foreground subtraction whatsoever, simply averaging
the Planck channels with inverse-variance weighting.

While less coherence is usually a bad thing, Figure 12
shows a subtle exception to this rule at $\ell \sim 100$. Here weak
coherence is seen to be worse than no coherence at all.
Figures 13-15 shed more light on this perhaps surprising
behavior by showing how the channel weighting changes
as we increase the frequency coherence. These figures
correspond to three of the five curves in Figure 12 (the top,
middle and bottom ones). Figure 13 shows the case of
completely incoherent foregrounds ($\Delta\alpha = \infty$). It gives an
inverse-variance weighting just as in Figure 9, but with
the variance receiving a contribution from foregrounds as
well as noise. In Figure 14, we see that the poor coherence
between widely separated channels is forcing the method
to do much of the foreground subtraction using neighboring
channels, using costly large-amplitude weights at 100,
143 and 217 GHz on large scales. In contrast, the case of
ideal foregrounds shown in Figure 15 is seen to be much
easier to deal with, requiring no weights exceeding unity
in amplitude. For instance, the small dust contribution at
low frequencies can be subtracted out essentially for free,
by a tiny negative weight for the dust-dominated 545 GHz
channel.

The non-monotonic behavior (where things eventually
start improving again when the coherence becomes sufficiently
low) does not occur if there is merely a single
foreground component present with no detector noise.
Instead, it results from an interplay between foregrounds and
noise. A perfectly incoherent foreground can be efficiently
dealt with in the same way as noise: by inverse-variance
weighting the channels as in Figure 13, the incoherent
foreground fluctuations will average down, whereas a coherent
foreground would not. Typically, the worst case is for $\xi$
of order unity, although the exact value depends on the
other foregrounds. Figure 12 thus shows that although we
do not appear to live in the worst of all possible worlds,
we are only off by a small factor!

Fig. 12. — The effect of frequency coherence. The total power
spectrum from noise and foregrounds in the cleaned Planck T map
is shown for five different assumptions about frequency coherence,
corresponding to multiplying all values of $\Delta\alpha$ from the MID model
(heavy curve) by $\infty$, 2, 1, 0.5 and 0, respectively (from top to bottom).
The fiducial CMB power spectrum is shown for comparison.

Fig. 13. — The weights $w_\ell^T$ with which the unpolarized maps
at the 9 Planck frequencies are combined into a single map are
plotted as a function of angular scale $\ell$ for the MID model, but with
completely incoherent foregrounds ($\Delta\alpha = \infty$).

Fig. 14. — The weights $w_\ell^T$ with which the unpolarized maps at
the 9 Planck frequencies are combined into a single map are plotted
as a function of angular scale $\ell$ for the MID foreground model.

Fig. 15. — The weights $w_\ell^T$ with which the unpolarized maps at
the 9 Planck frequencies are combined into a single map are plotted
as a function of angular scale $\ell$ for the MID model, but with perfectly
coherent foregrounds ($\Delta\alpha = 0$).

Fig. 16. — The effect of faulty assumptions about frequency
coherence. The total power spectrum from noise and foregrounds in
the cleaned Planck T map is shown for the MID model using three
different assumptions for the cleaning process: the correct (MID)
$\Delta\alpha$-values (bottom), $\Delta\alpha = \infty$ (middle, cautious) and $\Delta\alpha = 0$ (top,
foolhardy). The fiducial CMB power spectrum is shown for comparison.

4.6.2. Effect of incorrect assumptions about $\Delta\alpha$

What if our model is incorrect? Figure 16 uses equation (49)
to show the effect of two kinds of errors: being
too optimistic and being too pessimistic about one's ability
to model the frequency dependence of foregrounds.
For all three curves, the MID model is used as the truth,
but the weights $\mathbf{w}$ for the foreground subtraction are for
different assumptions about the frequency coherence. Not
surprisingly, correct assumptions give the best removal,
showing the importance of accurately measuring the actual
frequency coherence of foregrounds. The curve labeled
"cautious" shows the result of assuming incoherent
foregrounds ($\Delta\alpha = \infty$, corresponding to no foreground
subtraction at all, merely inverse-variance averaging with
no negative weights as in Figure 13), whereas the one labeled
"foolhardy" illustrates the effect of assuming ideal
foregrounds ($\Delta\alpha = 0$, using the weights of Figure 15). The
fact that the former generally lies beneath the latter shows
that when faced with uncertainty about $\Delta\alpha$, it is better to
err on the side of caution: in our example, foreground removal
based on the overly optimistic model is performing
worse than no foreground removal at all.

4.6.3. Effect of functional form of coherence

We can rewrite our coherence model of equation ( p5| ) as

$$R_{\nu\nu'} = f\left(\frac{\ln(\nu/\nu')}{\xi}\right), \qquad (50)$$

where $f(x) = e^{-x^2/2}$. The derivation in T98 did not
show that $f(x)$ was a Gaussian, merely that it behaved like
a parabola at the origin with $f(0) = 1$, $f'(0) = 0$ and
$f''(0) = -1$. Since $f$ (which we will term the coherence
function) has not yet had its shape accurately measured
for any foreground, we repeat the Planck analysis for a
variety of such functions of the form

$$f(x) = \left(1 + \frac{x^2}{2n}\right)^{-n}. \qquad (51)$$

The case $n = \infty$ recovers the Gaussian of equation ( p5| ),
$n = 1$ gives a Lorentzian, etc.

Figure 17 shows that the shape of the far wings of $f$ is
only of secondary importance; the main question is how
correlated neighboring channels are, which for $\xi \gg 1$
depends mainly on the curvature of $f$ near the origin.
Narrowing the wings (increasing $n$) usually helps slightly, once
again demonstrating that more coherence is not necessarily
a good thing. For comparison, Figure 17 shows the case
where $f(x) = 1$ at the origin and vanishes elsewhere and
the case $f(x) = 1$, corresponding to the limits $\Delta\alpha = \infty$
and $\Delta\alpha = 0$, respectively. Also shown is the exponential
coherence function $f(x) = \exp[-|x|/\sqrt{2}]$. This is seen to
be quite a conservative choice, giving even larger residuals
than the $\Delta\alpha = \infty$ case, since the correlations between
neighboring channels are strong enough to be important
but not good enough to be really useful for foreground
subtraction. A generalization of this exponential coherence
function will come in handy in §5, where our goal is
to be as pessimistic as possible with the intent to destroy
parameter estimation with foregrounds.

Fig. 17. — Changing the functional form of frequency coherence.
The total power spectrum from noise and foregrounds in the cleaned
Planck T map is shown for the MID model using different shapes
for the coherence function $f$, shown with the corresponding line
type in the inset. The shapes are Gaussian (heavy solid curve),
Lorentzian (short-dashed), exponential (thin solid curve), flat (dotted)
and completely incoherent (long-dashed). The fiducial CMB
power spectrum is shown for comparison.
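The family of coherence functions discussed in this subsection is easy to explore numerically. The sketch below is a minimal illustration, assuming the form $f(x) = (1 + x^2/2n)^{-n}$ with the Gaussian as the $n \to \infty$ limit and assuming that the coherence between two channels is $f$ evaluated at $\ln(\nu/\nu')/\xi$. The three frequencies and the value of `xi` are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math


def coherence(x, n=None):
    """Coherence function f(x) = (1 + x^2/(2n))^(-n).

    n=None selects the Gaussian limit exp(-x^2/2); n=1 is the Lorentzian.
    """
    if n is None:
        return math.exp(-0.5 * x * x)
    return (1.0 + x * x / (2.0 * n)) ** (-n)


def exponential_coherence(x):
    """Exponential alternative f(x) = exp(-|x|/sqrt(2))."""
    return math.exp(-abs(x) / math.sqrt(2.0))


def coherence_matrix(freqs_ghz, xi, f=coherence):
    """Matrix of f(ln(nu_a/nu_b)/xi) over all channel pairs."""
    return [[f(math.log(nu_a / nu_b) / xi) for nu_b in freqs_ghz]
            for nu_a in freqs_ghz]


# Illustrative channels (GHz) and coherence scale; not the paper's setup.
channels = [100.0, 143.0, 217.0]
r_gauss = coherence_matrix(channels, xi=1.0)
r_lorentz = coherence_matrix(channels, xi=1.0, f=lambda x: coherence(x, n=1))
```

All members of the family share the same value and curvature at the origin, so for large `xi` the matrices for different `n` agree closely between neighboring channels and differ mainly for widely separated ones.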



5. SIMULTANEOUS ESTIMATION OF FOREGROUNDS AND 
COSMOLOGY 

To this point, we have considered the case in which the 
statistical properties of the foregrounds are known exactly. 
Moreover we have shown that incorrect assumptions about 
these properties can lead to foreground removal strategies 
that do more harm than good. This raises the question of
whether we will in fact know these foregrounds to the level 
needed for accurate subtraction. The sky maps from CMB 
satellite missions will provide some of the most relevant 
data on this question. Hence, we will next consider the 
case in which cosmological parameters and the foreground 
model are simultaneously estimated from the CMB data. 

At this point, the calculation becomes less well-defined. 
If we are allowed to consider arbitrary excursions around 
the fiducial model, then cosmological parameter estima- 
tion fails completely. There is no mathematical way to 
exclude a foreground that matches the CMB frequency 
dependence, is perfectly coherent, and has an arbitrarily 
inconvenient power spectrum (say, one mimicking the vari- 
ation of a cosmological parameter). Physically, however, 
we believe this to be unreasonable. We therefore must con- 
struct a parameterized model of foregrounds that allows 
for a reasonably, but not completely, general coverage of 
the possibilities. 

5.1. Cosmological Parameters 

We adopt a low-density, spatially flat adiabatic CDM
model for our cosmology. The model has a matter density
of $\Omega_m = 0.35$, a baryon density of $\Omega_b = 0.05$, a
massive neutrino density of $\Omega_\nu = 0.0175$ (one massive
species with a mass of 0.7 eV), and a cosmological constant
$\Omega_\Lambda = 0.65$. The primordial helium fraction is $Y_p = 0.24$.
The Hubble constant is $H_0 = 100h$ km s$^{-1}$ Mpc$^{-1}$ =
65 km s$^{-1}$ Mpc$^{-1}$. The universe is reionized suddenly at
low redshift with an optical depth of $\tau = 0.05$. The primordial
power spectrum is scale-invariant, so the scalar
spectral index is $n_s = 1$. There are no tensors ($T/S = 0$).
The model is normalized to the COBE-DMR experiment. 
This is the same model that was used in Eisenstein et al. 
(1999, hereafter "E99"), and further details on the above 
choices can be found there. 

We will ask how well CMB data can constrain a 10-dimensional
excursion around this parameter space. The
parameters are $\Omega_m h^2$, $\Omega_b h^2$, $\Omega_\nu h^2$, $\Omega_\Lambda$, $\tau$, $Y_p$ (constrained
to vary by 0.02 at 1-$\sigma$), $n_s$, $n_s'$, $T/S$, and the normalization.
$n_s'$ (denoted $\alpha$ in E99) is the running of the scalar
tilt,

$$n_s(k) = n_s(k_{\rm fid}) + n_s' \ln(k/k_{\rm fid}), \qquad (52)$$

where $k_{\rm fid} = 0.025$ Mpc$^{-1}$. Note that $\Omega_m = 1 - \Omega_\Lambda$ and
$h = \sqrt{(\Omega_m h^2)/(1 - \Omega_\Lambda)}$ are defined implicitly and hence
can vary. This parameter space is identical to that of E99
except that we have not included spatial curvature. There
is a severe degeneracy between curvature and the cosmological
constant (Bond et al. 1997; Zaldarriaga et al. 1997).
Since this degeneracy is best broken by using non-CMB
data (e.g., galaxy redshift surveys or SN Ia), we choose to
focus only on the well-constrained combination of the two
parameters here. Details of how we perform the numerical
derivatives of the power spectra with respect to these cosmological
parameters can be found in E99. Fortunately,
all the derivatives with respect to the foregrounds (§||) can
be done analytically, so that no new numerical problems
are introduced.

Fig. 18. — The figure shows the four triangle functions $L_i$ that
span the excursions in frequency dependence for foregrounds seen by
MAP. Any function that is piecewise linear between the four break
points (e.g., the heavy curve) can be written as a linear combination
of these functions in this range.
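For the cosmological parameters of § 5.1, the implicitly defined Hubble parameter and the running spectral index of equation (52) can be evaluated with a few lines of Python. This is only a sketch of the two relations quoted in the text; the function names and default arguments are assumptions chosen for illustration.

```python
import math


def hubble_h(omega_m_h2, omega_lambda):
    """h implied by a flat model: h = sqrt(Omega_m h^2 / (1 - Omega_Lambda))."""
    return math.sqrt(omega_m_h2 / (1.0 - omega_lambda))


def scalar_index(k, n_s_fid=1.0, running=0.0, k_fid=0.025):
    """Equation (52): n_s(k) = n_s(k_fid) + n_s' ln(k / k_fid), k in Mpc^-1."""
    return n_s_fid + running * math.log(k / k_fid)


# Fiducial model of the text: Omega_m = 0.35, h = 0.65, Omega_Lambda = 0.65.
h_fid = hubble_h(0.35 * 0.65 ** 2, 0.65)
```

Varying $\Omega_\Lambda$ at fixed $\Omega_m h^2$ therefore changes both $\Omega_m$ and $h$, which is why neither appears as an independent parameter.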

5.2. Foreground Parameters 

We allow for uncertainty in the foregrounds by adding a 
large number of parameters to the models specified in §|[ 
Recall that each component was specified by a frequency 
dependence, a frequency coherence, and a power spectrum 
for each polarization type (T, E, B, and X). For each type 
and each component, we now include parameters to allow 
excursions in frequency dependence, frequency coherence, 
and spatial power. As described below, we use of order 
fifty additional parameters denoted by vectors q, r and s 
for each foreground component. 

For the frequency dependence, we allow a piecewise
power-law excursion in thermodynamic temperature around
the fiducial model

$$\ln\Theta^P_{(k)}(\nu) = \ln\Theta^P_{(k)}(\nu)\big|_{\rm fid} + L(\ln\nu;\mathbf{q}^P_{(k)}). \qquad (53)$$

Here, the function $L$ is a linear interpolation between the
values $\mathbf{q} = (q_1, \ldots, q_{n_\nu})$ at the breakpoints $\nu^1, \ldots, \nu^{n_\nu}$,
so $L(\ln\nu^i;\mathbf{q}^P_{(k)}) = q^P_{i(k)}$ and $L$ is a straight line between
breakpoints, and the fiducial model ("fid") has $q^P_{i(k)} = 0$.
This means that we can rewrite it as

$$L(\ln\nu;\mathbf{q}) = \sum_i q_i L_i(\ln\nu), \qquad (54)$$

where the functions $L_i$ have triangular shape as illustrated
in Figure 18: $L_i(\ln\nu^i) = 1$, goes linearly to zero at the
neighboring breakpoints, and vanishes everywhere else.
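The decomposition of equation (54) into triangle functions can be checked with a short sketch. The code below is an illustration under the assumption that the excursion is only evaluated between the first and last breakpoints; the function names are hypothetical and the parameter values `q` are arbitrary.

```python
import math


def triangle(i, x, knots):
    """Triangle function L_i at x = ln(nu).

    knots holds ln(breakpoint frequency) in ascending order. L_i equals 1 at
    knots[i], falls linearly to 0 at the neighboring knots, and is 0 elsewhere.
    """
    center = knots[i]
    if x == center:
        return 1.0
    if i > 0 and knots[i - 1] <= x < center:
        return (x - knots[i - 1]) / (center - knots[i - 1])
    if i < len(knots) - 1 and center < x <= knots[i + 1]:
        return (knots[i + 1] - x) / (knots[i + 1] - center)
    return 0.0


def excursion(x, q, knots):
    """Piecewise-linear excursion L(x; q) = sum_i q_i L_i(x)."""
    return sum(q_i * triangle(i, x, knots) for i, q_i in enumerate(q))


# MAP breakpoints quoted in the text (GHz) and arbitrary illustrative q values.
knots = [math.log(nu) for nu in (15.7, 31.5, 62.9, 126.0)]
q = [0.1, -0.2, 0.3, 0.0]
```

At each breakpoint the excursion returns the corresponding `q[i]`, and halfway (in $\ln\nu$) between two breakpoints it returns their average, as expected for linear interpolation.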

We allow a separate frequency excursion for T, E, and
B for each foreground component. For the cross-correlation
we allow for the possibility that the excursion in the
correlation of the temperature at $\nu$ and E-polarization at
$\nu'$ is not the same as that of the temperature at $\nu'$ with the
E-polarization at $\nu$. We therefore define two excursions for
the cross-correlation, $C^X_{\ell(k)}(\nu,\nu') \propto \Theta^{X_T}_{(k)}(\nu)\,\Theta^{X_E}_{(k)}(\nu')$. One
of the excursions affects the temperature index, the other
the polarization index. In the fiducial model,
$\Theta^{X_T}_{(k)}\big|_{\rm fid} = \Theta^X_{(k)}$, and likewise for
$X_E$, since we have assumed that the fiducial model is symmetric
in this respect.

Since the breakpoints $\nu^i$ specify the degrees of freedom
in the foreground model, they are independent of the number
and location of observing frequencies $\nu_f$ for any given
experiment. We choose them to be evenly spaced in $\ln\nu$,
with a factor of 2 in frequency between each, and centered
at the geometric mean of the highest and lowest frequencies
of the experiment. This means that we specify
$n_\nu = 4$ breakpoints for Boomerang (67.1, 134, 268, and
537 GHz) and MAP (15.7, 31.5, 62.9, and 126 GHz) and
$n_\nu = 6$ for Planck (28.3, 56.7, 113, 227, 454, 907 GHz).
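The breakpoint recipe described above can be reproduced in a few lines. In the sketch below the function name is hypothetical, and the lowest and highest channel frequencies passed in (22 and 90 GHz for MAP, 30 and 857 GHz for Planck, 90 and 400 GHz for Boomerang) are assumed inputs; they are used here because they reproduce the breakpoints quoted in the text to the stated precision.

```python
import math


def breakpoints(nu_min, nu_max, n_points, ratio=2.0):
    """Breakpoints evenly spaced in ln(nu) by a factor `ratio`, centered on the
    geometric mean of the lowest and highest channel frequencies (GHz)."""
    center = math.sqrt(nu_min * nu_max)
    return [center * ratio ** (i - (n_points - 1) / 2.0) for i in range(n_points)]


map_bp = breakpoints(22.0, 90.0, 4)        # assumed MAP channel range
planck_bp = breakpoints(30.0, 857.0, 6)    # assumed Planck channel range
boom_bp = breakpoints(90.0, 400.0, 4)      # assumed Boomerang channel range
```

Because the spacing is fixed at a factor of 2, widening the frequency coverage of an experiment adds breakpoints rather than stretching them, which keeps the scale of allowed foreground variations the same from one experiment to the next.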

In the $n_\nu = 2$ limit, the new parameters correspond to
varying the normalization and power-law exponent of the 
frequency dependence of the foreground component. We 
chose to extend this freedom by piecewise-linear interpo- 
lation rather than smoother options for technical reasons. 
Splines have non-local behavior; frequencies far outside 
the range of the CMB experiment would affect the re- 
sults inside the range. Polynomials are scale-free, so that 
a polynomial of a given order will adapt itself to the partic- 
ular experimental specifications so as to put the maximal 
number of wiggles inside the frequency range. This would 
mean that the effective number of degrees of freedom in 
the foreground model would not be consistent from one 
experiment to the next. Simple interpolation avoids these 
problems: one need not specify any breakpoints beyond 
the first outside the frequency range of the experiment, 
and the scale for variations in the foreground is indepen- 
dent of the experiment. 

We do not allow frequency variations for the thermal SZ
component, because its spectrum is theoretically known.
As discussed in §2^, relativistic corrections are expected to 
be negligible for filaments, so we ignore this complication. 

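To include these parameters in the Fisher matrix of
equation (|2^), we must specify the derivatives of $\mathbf{C}_\ell$ with
respect to the interpolation parameters $q^P_{i(k)}$. Clearly only
the elements of the same component and polarization type
are affected. For $P = T$, $E$, and $B$, the derivatives of that
submatrix are

$$\frac{\partial C^{Pff'}_{\ell(k)}}{\partial q^P_{i(k)}}\bigg|_{\rm fid} = C^{Pff'}_{\ell(k)}\left[L_i(\ln\nu_f) + L_i(\ln\nu_{f'})\right], \qquad (55)$$

where we have used equation (54). For $P = X$, the derivatives
are

$$\frac{\partial C^{Xff'}_{\ell(k)}}{\partial q^{X_T}_{i(k)}}\bigg|_{\rm fid} = C^{Xff'}_{\ell(k)}\,L_i(\ln\nu_f), \qquad (56)$$

$$\frac{\partial C^{Xff'}_{\ell(k)}}{\partial q^{X_E}_{i(k)}}\bigg|_{\rm fid} = C^{Xff'}_{\ell(k)}\,L_i(\ln\nu_{f'}). \qquad (57)$$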



For the frequency coherence, we adopt the exponential
model for the matrix $\mathbf{R}$ using

$$R_{ff'} = \exp\left[-\left|\int_{\nu_f}^{\nu_{f'}}\Lambda(\nu)\,d(\ln\nu)\right|\right]. \qquad (58)$$

In the fiducial model, $\Lambda(\nu) = \Delta\alpha = 1/(\sqrt{2}\,\xi)$, independent
of frequency. Now we allow excursions of the form

$$\Lambda(\nu) = \Delta\alpha + L(\ln\nu;\mathbf{r}), \qquad (59)$$

where $L$ is a linear interpolation function as before, parameterized
by $\mathbf{r} = (r_1, \ldots, r_{n_\nu})$. We use the same spacing
of the interpolation points $\nu^1, \ldots, \nu^{n_\nu}$ as in the frequency
dependence case described above.

As above, for each foreground component except SZ, we
allow separate excursions for each of T, E, and B and two
excursions for X. The derivatives of $\mathbf{C}$ are ($P = T, E, B$)

$$\frac{\partial C^{Pff'}_{\ell(k)}}{\partial r^P_{i(k)}} = -C^{Pff'}_{\ell(k)}\left|\int_{\nu_f}^{\nu_{f'}} d(\ln\nu)\,L_i(\ln\nu)\right|. \qquad (60)$$

For the cross-correlation X, we allow for an asymmetric
excursion by invoking two excursions of the above form
and setting either the upper ($f < f'$) or lower ($f > f'$)
triangle elements to zero.

For the spatial power, we consider excursions of the form

$$\ln C^P_{\ell(k)} = \ln C^P_{\ell(k)}\big|_{\rm fid} + L(\ln\ell;\mathbf{s}^P_{(k)}), \qquad (61)$$

where $L$ is a linear interpolation function. The parameters
$\mathbf{s} = (s_1, \ldots, s_{n_\ell})$ are the values of $L$ at a grid of $\ell$ that begins
at $\ell = 2$ and increases by factors of $e$. As we sum the
Fisher contribution to $\ell_{\max} = 2800$, this gives $n_\ell = 9$ grid
points. However, the overall normalization (i.e., moving
all the $s_i$ by the same amount) is degenerate with an overall
shift in the frequency dependence (eq. (53)), so in cases
other than the thermal SZ, we hold the middle spatial $s_i$
($\ell = 109$) equal to zero.
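The multipole grid described above is simple to generate. The following sketch assumes that the grid is extended until its last point reaches or exceeds $\ell_{\max}$, so that the interpolation covers the whole summed range; under that assumption it reproduces the nine grid points and the middle point near $\ell = 109$ quoted in the text. The function name is hypothetical.

```python
import math


def ell_grid(ell_start=2.0, ell_max=2800.0):
    """Grid in ell starting at ell_start and growing by factors of e, extended
    until the last point is at or beyond ell_max (an assumption of this sketch)."""
    grid = [ell_start]
    while grid[-1] < ell_max:
        grid.append(grid[-1] * math.e)
    return grid


grid = ell_grid()
middle = grid[len(grid) // 2]   # the point whose parameter is held at zero
```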

Separate excursions are allowed in $P = T$, $E$, $B$, and
$X$ for each foreground component, including the SZ effect.
The derivatives of $\mathbf{C}$ with respect to these parameters are

$$\frac{\partial C^{Pff'}_{\ell(k)}}{\partial s^P_{i(k)}}\bigg|_{\rm fid} = C^{Pff'}_{\ell(k)}\,L_i(\ln\ell), \qquad (62)$$

where all terms involving different $k$ and $P$ are zero, as
before. The $L_i$ spatial functions are defined analogously
to equation (54).

In short, for most foreground components, we have
$10n_\nu + 4n_\ell - 4$ free parameters in $(\mathbf{q}, \mathbf{r}, \mathbf{s})$. For the thermal
SZ, we have only $4n_\ell$. However, for unpolarized components,
the parameters for polarization excursions have zero
derivatives, so we remove them. This leaves 385 (489) parameters
for MAP (Planck) for the MID model, 257 (325)
for the OPT model, and 441 (561) for the PESS model.
For the MID model, 105 (129) of the parameters are associated
with the intensity anisotropies, so there are 105
parameters for our Boomerang estimates.
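The bookkeeping behind these totals can be sketched as follows. The per-component counts follow the text ($10n_\nu + 4n_\ell - 4$ for a polarized component, and only the intensity share for an unpolarized one). The numbers of polarized and unpolarized components assigned to each scenario below are assumptions: they are not stated in this section and were chosen only because they reproduce the quoted totals, with the thermal SZ treated as unpolarized. The function names are hypothetical.

```python
def count_parameters(n_nu, n_ell, n_polarized, n_unpolarized):
    """Total foreground parameters for one experiment and scenario.

    Polarized component:   10*n_nu + 4*n_ell - 4   (q, r and s for T, E, B, X).
    Unpolarized component: 2*n_nu + n_ell - 1      (intensity share only).
    Thermal SZ:            n_ell                   (spatial excursion in T only).
    """
    per_polarized = 10 * n_nu + 4 * n_ell - 4
    per_unpolarized = 2 * n_nu + n_ell - 1
    return n_polarized * per_polarized + n_unpolarized * per_unpolarized + n_ell


def count_intensity_parameters(n_nu, n_ell, n_components):
    """Parameters tied to intensity alone, for n_components plus thermal SZ."""
    return n_components * (2 * n_nu + n_ell - 1) + n_ell


N_ELL = 9
# Assumed (polarized, unpolarized) non-SZ component counts per scenario.
SCENARIOS = {"OPT": (3, 2), "MID": (5, 1), "PESS": (6, 0)}
totals = {
    name: (count_parameters(4, N_ELL, pol, unpol),    # MAP, n_nu = 4
           count_parameters(6, N_ELL, pol, unpol))    # Planck, n_nu = 6
    for name, (pol, unpol) in SCENARIOS.items()
}
```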

We allow these parameters to vary without external pri- 
ors in almost all cases. As we will show below, the CMB 
experiments are able to constrain the foreground model 
well enough to extract the cosmic signal. The one excep- 
tion is the thermal SZ effect with the MAP experiment. 
The frequency dependence of the SZ is similar to the cos- 
mic temperature variations for frequencies much below 200 
GHz. If we allow unfettered variations in the SZ spatial 
power spectrum, there are significant degradations in the 



performance of MAP on cosmological parameters. How- 
ever, these degeneracies correspond to very large depar- 
tures from the fiducial SZ level. We therefore include a 
prior (for MAP only) that the SZ power spectrum cannot
vary by more than a factor of 10 from the fiducial
level (i.e., the parameters $s_i$ cannot exceed 10 at 1-$\sigma$
confidence). This is an extremely generous prior (numerical
simulations are surely correct to within a factor of 10 at
68% confidence), but it substantially reduces the degradation
of the MAP performance in the presence of SZ signals.
In detail, for the MID model, MAP with T alone
could measure $\Omega_b h^2$ to 0.0036 with the SZ variations being
omitted, 0.0037 with the prior described above, and
0.0075 with a prior of $10^6$ on the SZ variations. For this
and other parameters, the prior of 10 removes variations
in the SZ as a source of cosmological uncertainty in the
MID model. The PESS model, with its 10-fold increase in
the fiducial SZ level, has 10-20% differences between results
with a prior of 10 and those with no SZ variations.
Planck and Boomerang can control the SZ to better than
a factor of 10, so no prior is applied.
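The effect of an external prior on a marginalized error can be illustrated with a toy two-parameter Fisher matrix. This sketch uses the standard treatment in which a Gaussian prior of width $\sigma$ on a parameter adds $1/\sigma^2$ to the corresponding diagonal element; the matrix entries are invented for illustration and do not correspond to any number in the paper, and the function names are hypothetical.

```python
import math


def marginalized_error(fisher, index):
    """Marginalized 1-sigma error sqrt((F^-1)_ii) for a 2x2 Fisher matrix."""
    (a, b), (c, d) = fisher
    det = a * d - b * c
    inverse_diagonal = (d / det, a / det)
    return math.sqrt(inverse_diagonal[index])


def add_prior(fisher, index, sigma_prior):
    """Return a copy of the Fisher matrix with a Gaussian prior on one parameter."""
    updated = [list(row) for row in fisher]
    updated[index][index] += 1.0 / sigma_prior ** 2
    return updated


# Toy example: parameter 0 is "cosmological", parameter 1 is a nearly
# degenerate foreground amplitude (all numbers hypothetical).
fisher = [[4.0, 1.9], [1.9, 1.0]]
error_free = marginalized_error(fisher, 0)                      # no prior
error_prior = marginalized_error(add_prior(fisher, 1, 1.0), 0)  # with prior
error_fixed = 1.0 / math.sqrt(fisher[0][0])                     # foreground known
```

The prior moves the marginalized error from the unconstrained value toward the value obtained when the foreground parameter is known exactly, which is the qualitative behavior described above for the SZ prior with MAP.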

5.3. Cosmological Parameters in the Presence of 
Foregrounds 

Because the variations in the foreground model have ef- 
fects at all £, we cannot express the effects of the fore- 
grounds as a simple degradation of the error bars at each 
multipole (cf. Fig. 11). Excursions from the fiducial model
produce changes in frequency and scale dependence that
can compensate both each other and cosmological signals 
in complicated ways. In other words, with this more com- 
plicated foreground model, one does not recover a cleaned 
CMB power spectrum as an intermediate step of the anal- 
ysis, but must proceed directly to the estimation of the 
parameters characterizing the models for foregrounds and 
cosmology. To quantify the effects of the foregrounds, we 
will therefore simply compare the final marginalized error 
bars on cosmological parameters. 

For display purposes, we will focus on the performance 
on four parameters, chosen to illustrate important aspects 
of the interplay between foregrounds and cosmological sig- 
nals: 

1. The baryon density $\Omega_b h^2$ as measured only from the
temperature information. This parameter is sensitive
to the structure of the acoustic peaks and to the
diffusion scale (Hu & Sugiyama 1995).

2. $\Omega_b h^2$ as measured from both intensity and polarization.
The polarization of the acoustic peaks substantially
improves the accuracy with which this parameter
can be measured (Zaldarriaga et al. 1997),
mainly through the X power spectrum, as we will
see in § 5.4.5.

3. The reionization optical depth $\tau$ as measured from
intensity and polarization. This is dominated by the
E-channel signal at large angular scales (Hogan et al.
1982), thereby testing how well the diffuse polarized
galactic signals can be removed.

4. The tensor-to-scalar ratio $T/S$ as measured from intensity
and polarization. This is the only cosmic
signal in the B-channel polarization (Kamionkowski
et al. 1997; Zaldarriaga & Seljak 1997).

Table 3

Marginalized Errors with Foreground Variations

                                                                Foregrounds
Experiment      Quantity                                 None      Known    Unknown

Boomerang (T)   $\ln(\Omega_m h^2)$                      0.45      1.007    1.282
                $\ln(\Omega_b h^2)$                      0.27      1.008    1.242
                $m_\nu$ (eV) $\propto \Omega_\nu h^2$    3.5       1.006    1.51
                $n_s(k_{\rm fid})$                       0.30      1.007    1.203
                $\Omega_\Lambda$                         0.57      1.007    1.287
                $\tau$                                   1.3       1.016    1.87
                $T/S$                                    1.2       1.007    1.52

MAP (T)         $\ln(\Omega_m h^2)$                      0.20      1.027    1.393
                $\ln(\Omega_b h^2)$                      0.12      1.027    1.453
                $m_\nu$ (eV) $\propto \Omega_\nu h^2$    0.87      1.017    1.65
                $n_s(k_{\rm fid})$                       0.11      1.026    1.332
                $\Omega_\Lambda$                         0.23      1.026    1.324
                $\tau$                                   0.31      1.014    1.69
                $T/S$                                    0.42      1.023    1.240

MAP (TP)        $\ln(\Omega_m h^2)$                      0.080     1.208    1.66
                $\ln(\Omega_b h^2)$                      0.051     1.201    2.01
                $m_\nu$ (eV) $\propto \Omega_\nu h^2$    0.57      1.078    2.06
                $n_s(k_{\rm fid})$                       0.041     1.264    2.63
                $\Omega_\Lambda$                         0.091     1.230    1.74
                $\tau$                                   0.018     1.90     3.33
                $T/S$                                    0.16      1.309    1.86

Planck (T)      $\ln(\Omega_m h^2)$                      0.062     1.014    1.042
                $\ln(\Omega_b h^2)$                      0.035     1.013    1.040
                $m_\nu$ (eV) $\propto \Omega_\nu h^2$    0.55      1.010    1.029
                $n_s(k_{\rm fid})$                       0.041     1.013    1.031
                $\Omega_\Lambda$                         0.080     1.013    1.039
                $\tau$                                   0.23      1.015    1.074
                $T/S$                                    0.18      1.011    1.035

Planck (TP)     $\ln(\Omega_m h^2)$                      0.016     1.056    1.160
                $\ln(\Omega_b h^2)$                      0.0094    1.028    1.165
                $m_\nu$ (eV) $\propto \Omega_\nu h^2$    0.24      1.032    1.075
                $n_s(k_{\rm fid})$                       0.0076    1.109    1.303
                $\Omega_\Lambda$                         0.022     1.051    1.151
                $\tau$                                   0.0036    1.69     1.96
                $T/S$                                    0.0073    4.04     6.58


NOTES. Marginalized errors for some cosmological parameters
within the 12-dimensional adiabatic CDM family of cosmological
models and the exponential coherence function (eq. feq|) for the
foregrounds. "None" column lists the $1\sigma$ error in the case
where there are no foregrounds. "Known" column lists the relative
degradation when our MID foreground model is added under the
assumption that the statistical properties of the foregrounds are
known exactly. "Unknown" column lists the relative degradation when
the statistical properties of the foregrounds must be simultaneously
estimated within a generous parameterization of possible models.
Results for MAP with intensity only (T) and with intensity and
polarization (TP) are shown; likewise for Planck. $n_s(k_{\rm fid})$
is the logarithmic derivative of the scalar primordial power
spectrum at $k_{\rm fid} = 0.025\,{\rm Mpc}^{-1}$; in the presence
of $n'_s \neq 0$, $n_s$ is a function of scale. Cosmological model
is $\Omega_m = 0.35$, $\Omega_b = 0.05$, $\Omega_\nu = 0.0175$
($m_\nu = 0.7$ eV), $\Omega_\Lambda = 0.65$, $h = 0.65$, $n_s = 1$,
$n'_s = 0$, $\tau = 0.05$, $Y_p = 0.24$, and T/S = 0.
[...] = [...] and cannot vary.
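
The difference between treating foreground properties as known and estimating them simultaneously can be illustrated with a toy Fisher matrix. The following is a minimal sketch and not the calculation behind the table: the 3x3 matrix, its entries, and the function names are hypothetical. Holding the foreground parameter fixed corresponds to inverting only the cosmological block, while estimating it from the same data corresponds to inverting the full matrix and reading off the cosmological diagonal.

```python
import numpy as np


def marginalized_errors(fisher, n_cosmo):
    """1-sigma errors on the first n_cosmo parameters, marginalizing over the rest."""
    cov = np.linalg.inv(fisher)
    return np.sqrt(np.diag(cov)[:n_cosmo])


def conditional_errors(fisher, n_cosmo):
    """1-sigma errors on the first n_cosmo parameters with the rest held fixed."""
    cov = np.linalg.inv(fisher[:n_cosmo, :n_cosmo])
    return np.sqrt(np.diag(cov))


# Hypothetical Fisher matrix: two cosmological parameters followed by one
# foreground parameter. The numbers are illustrative only.
fisher = np.array([[400.0,  50.0, 120.0],
                   [ 50.0, 100.0,  30.0],
                   [120.0,  30.0,  90.0]])

known = conditional_errors(fisher, n_cosmo=2)     # foreground parameter fixed
unknown = marginalized_errors(fisher, n_cosmo=2)  # foreground parameter fitted
degradation = unknown / known
print("relative degradation:", degradation)
```

In this sketch the ratio can never fall below one, since marginalizing over an extra parameter cannot shrink an error bar.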



In Figures 19 and 20, we show the degradation of the
MAP and Planck performance on these parameters in the
presence of our OPT, MID, and PESS foreground models.
In each case, the baseline is the performance in the
absence of any foregrounds at all. The degradations in
the presence of foregrounds are shown for both the case of
known properties and the case of unknown properties.

Progression of Foreground Models (MAP)

Fig. 19. Relative degradation in error bars from MAP on four
cosmological parameters as the amplitude of foregrounds is
increased. (top-left) Behavior of $\Omega_b h^2$ with intensity
information only (T). (proceeding clockwise) $\Omega_b h^2$, T/S,
and $\tau$ with intensity and polarization information (TP). Bars
show the error bar for each foreground case relative to the
no-foregrounds case; the $1\sigma$ error of the latter is listed
in each panel. The histograms show results for a series of
foreground models, ranging from no foregrounds to our OPT, MID,
and PESS models. (lightly shaded) Results with foregrounds of
known properties. (heavily shaded) Results with foregrounds whose
parameters must be simultaneously estimated from the CMB data.

Fig. 20. As Figure 19 but for Planck.

The performance on $\Omega_b h^2$ with and without polarization
is very encouraging. The degradations are less than a
factor of 2 in all cases but the PESS model in MAP (where
it reaches 4-5). Planck is able to survive even the PESS
model with only a factor of 2 increase in the projected
errors. The reason for this strong performance is the
detailed structure of the acoustic peaks. Even if cleaning
is imperfect, the foreground power spectra do not have
the oscillatory behavior of the cosmic derivatives and can
therefore be distinguished from variations in cosmological
parameters. Note that most of the degradation can be
attributed to uncertainties in the foreground model; the
performance in the case of known foreground properties is
nearly perfect.

The situation is somewhat less rosy for the large-angle
polarization signals. $\tau$ and T/S both have unique
signatures in the large-angle polarization, where signals from
the acoustic peaks are quite weak. In the absence of
foregrounds, even small cosmic signals can be detected because
their sample variance is equally low. With foregrounds,
the signal-to-noise is considerably worse. Performance is
correspondingly poorer, and the results depend more
sensitively on the severity of the foregrounds. However,
one should note that for the OPT and MID models, even
the large-angle polarization signal can be cleaned to
reasonable accuracy, yielding excellent constraints on $\tau$
and T/S. For the PESS model, the degradation is generally
more than a factor of 10, although the errors for Planck
would still be interesting ($\sim$5% for $\tau$, $\sim$10% for T/S).

Note that the extra frequency coverage and sensitivity 
of Planck do not imply that it will necessarily suffer less
relative degradation than MAP from the presence of fore- 
grounds; although Planck will clean more effectively, its 
baseline projections were more ambitious. 

In Table 3, we display the numerical results for Boomerang,
MAP, and Planck with the MID foreground model. The 
errors without foregrounds are shown, followed by the rel- 
ative degradations in the presence of foregrounds with 
and without knowledge of their properties. For the satel- 
lites, we show results considering intensity information 
alone and then both intensity and polarization informa- 
tion. Boomerang is slightly more robust against fore- 
grounds than MAP, reaching 3-fold degradations in the 
PESS model. Because our foreground model has more 
components centered at lower frequencies, the higher fre- 
quency range of Boomerang may shift it away from the 
reach of the model's variations. 



5.4. Which details matter? 



5.4.1. Dependence on foreground amplitude

As we increase the amplitudes of the foregrounds, which
components most affect which cosmological parameters?
We consider artificially boosting the amplitudes of the
MID foreground model by a factor of 10 (i.e., a factor
of 100 in power). In Figures 21 and 22, we apply this
factor of 10 increase separately to the diffuse components
and to the SZ and point source components. Generally,
the errors on $\tau$ and T/S are primarily affected by the
diffuse components rather than the point sources, while the
results for $\Omega_b h^2$ are the reverse. However, there are
some mild exceptions. For Planck with polarization,
$\Omega_b h^2$ is sensitive to the amplitude of the diffuse
components. This is because we assumed the power spectrum of
the polarization of dust and synchrotron to be considerably
bluer than that of the intensity. Hence, when increased in
amplitude, these foregrounds contaminate the acoustic peak
structure in the polarization and degrade the performance
somewhat. Also, for MAP the results on T/S are sensitive
to both diffuse and point-source components. MAP does
not make a strong detection of the tensor signal in the
polarization and is therefore reliant upon the large-angle
signal in the intensity. Boosting the point source amplitude
confuses the comparison between $\ell \approx 50$ and
$\ell \approx 500$ that would test for the presence of tensors.
The trends in Figures 21 and 22 are insensitive to whether
foreground parameters are assumed to be known or not.

Diffuse vs. Point Source Amplitude (MAP)

Fig. 21. As Figure 19, but increasing the diffuse foregrounds
and point sources separately. "Mid" bars refer to the MID
foreground model, simultaneously estimating the foreground
parameters. "Diff" bars show the results when the diffuse
components (i.e., all those that are not point sources) are
increased by a factor of 10 in amplitude. "PS" bars show the
results when the point source and SZ components have their
amplitude (after PSF fitting) increased by a factor of 10.
"x10" bars show the combined result. All values are shown as
the fractional increase relative to the results with no
foregrounds. This figure shows the case for MAP.

Fig. 22. As Figure 21 but for Planck.

Fig. 23. As Figure 19 but altering the frequency coherence
$\Delta\alpha$ of the fiducial foreground model. We scale all
$\Delta\alpha$ (see Table b|) by a factor of 0, 0.3, 1, 3, 10,
and $\infty$. As shown in T98, perfect coherence and perfect
decoherence are the best cases; intermediate values yield worse
performance. All errors are shown relative to those in the case
with no foregrounds.

5.4.2. Dependence on frequency coherence 



As was discussed in p.6.l|, the cleaning of foregrounds is
usually more effective when a map at one frequency gives
a good estimate of the foreground's presence at another
frequency. This is governed by the covariance matrix $R$
and the frequency coherence $\Delta\alpha$. Because the
parameters of this matrix are very poorly known at the present
time, it is important to check that our results are insensitive
to our choices in this sector.

As described above, the parameter $\Delta\alpha$ sweeps between
the two extremes of perfect correlation between frequency
channels and total independence. As shown in T98, both
of these cases have desirable properties for removal of
foregrounds. In the former case, there is a particular
combination of the frequency maps that completely removes the
component in question. In the latter case, one uses the
fact that any correlation between different maps must be
cosmic signal. Of course, neither perfect correlation nor
total independence is correct, and the intermediate case
admits a less complete cleaning of foregrounds.
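
The behavior at the two extremes can be seen in a toy two-channel example. This is a minimal sketch under stated assumptions, not the foreground model of this paper: the correlation coefficient `rho` is a hypothetical stand-in for the frequency coherence (rho = 1 for perfect correlation, rho = 0 for independence), and the foreground amplitudes and noise variance are made up. The channels are combined with the minimum-variance weights that leave the CMB unchanged, and the variance of what remains is reported.

```python
import numpy as np


def residual_variance(rho, amplitudes=(1.0, 2.0), noise_var=0.01):
    """Variance left in the cleaned map after minimum-variance combination.

    The CMB enters both channels with unit response, so the weights must sum
    to one. The contaminant covariance is a_i * a_j * rho between the two
    channels, plus uncorrelated detector noise on the diagonal.
    """
    a = np.asarray(amplitudes, dtype=float)
    coherence = np.array([[1.0, rho], [rho, 1.0]])
    contaminant = np.outer(a, a) * coherence + noise_var * np.eye(2)
    ones = np.ones(2)
    # Minimum-variance unbiased weights are w = C^-1 e / (e^T C^-1 e),
    # and the resulting variance is 1 / (e^T C^-1 e).
    return 1.0 / (ones @ np.linalg.solve(contaminant, ones))


for rho in (1.0, 0.5, 0.0):
    print(f"rho = {rho:.1f}: residual variance = {residual_variance(rho):.4f}")
```

With these illustrative numbers the intermediate case leaves more residual variance than either extreme, which is the qualitative behavior described above.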

In Figures 23 and 24, we show a sequence of models
in which we scale all of the $\Delta\alpha$'s in the MID model
by a constant that ranges from zero (perfect correlation) to
$\infty$ (complete independence). As expected, the errors on
cosmological parameters increase as one moves away from
the extremes and reach a maximum in the middle. This
peak typically occurs when the $\Delta\alpha$'s of the
foregrounds are multiplied by $\sim 3$, but the actual location
varies from case to case. However, because the peak is broad,
the errors from our base model are actually rather close to the
maximum. We therefore conclude that our treatment of
the covariance between the frequency channels has been
sufficiently conservative.

Fig. 24. As Figure 23 but for Planck.

5.4.3. Dependence on foreground model complexity 

We now consider turning off certain sets of variations to
examine which variations are causing the most degradation.
The results are shown in Table 4. We separate the
foreground parameters into three sets, namely those
involving the frequency dependence (the $q$ parameters), the
frequency coherence (the $r$ parameters), and the spatial power
spectrum (the $s$ parameters). By convention, the overall
normalization of the foreground component is carried by the
frequency dependence, not the spatial power spectrum. We
include these sets one at a time and in pairs to investigate
which is most important. Considered singly, uncertainties in
the shape of the power spectrum generally increase the error
bars the most, although uncertainties in frequency coherence
are more important for T/S in MAP. Taken together,
uncertainties in the frequency dependence and coherence are
important for T/S for both MAP and Planck.

5.4.4. Dependence on foreground type 

We next consider the results when all foreground properties
are unknown save for those of a single component.
This can identify the component about which external
information would have the most importance in improving
cosmological inferences. The results are again shown in
Table 4. For Boomerang, we find that uncertainties in the
foregrounds are not contributing much additional degradation
beyond the mere presence of the foregrounds; the
largest remaining concern is that free-free or synchrotron
emission might have a high-frequency contribution. For
MAP, improving knowledge of the vibrating dust has the
most impact, on both the large-angle polarization signals
and the small-angle acoustic features. Better control of
point sources would help $\Omega_b h^2$ from temperature
information, but one should recall that our foreground model
allows non-monotonic excursions in the power spectrum of
the point sources and so may be overly pessimistic. Further,
MAP suffers significant degradation unless the thermal SZ
is controlled by an external prior of a factor of
10, so robust calculations of the power spectrum of this
effect as a function of cosmology will be required. For
Planck, no foreground makes an enormous difference by
itself, although radio point sources have the largest (but
still small) effect.

Table 4

Adding and Removing Knowledge of Foregrounds

                        Boomerang            MAP                                                                                 Planck
Foreground Knowledge    $\Omega_b h^2$ (T)   $\Omega_b h^2$ (T)   $\Omega_b h^2$ (TP)  $\tau$ (TP)          T/S (TP)             $\Omega_b h^2$ (T)   $\Omega_b h^2$ (TP)  $\tau$ (TP)          T/S (TP)
Known Properties        1.000                1.000                1.000                1.000                1.000                1.000                1.000                1.000                1.000
Unknown $C_\ell$        1.133                1.162                1.202                1.320                1.085                1.014                1.029                ...                  ...
Unknown $\Theta$        1.036  ...  1.012  ...  1.121
Unknown $R$             ...  1.376  ...  1.014  ...  1.290
Unknown $C_\ell$ & ...  1.164                ...                  1.390                1.545                1.281                1.018                1.093                1.107                1.296
Unknown, except:
Known Free-free         1.098                1.348                1.616                1.755                1.414                1.023                1.132                1.161                1.619
Known Synchrotron       1.140  ...  1.622  ...  1.024  1.117  1.115  1.411
Known Vibrating Dust    1.172                1.208                1.278                1.266                1.136                1.024                1.122                1.133                1.492
Known Rotating Dust     1.221                1.320                1.585                1.656                1.367                1.024                1.127                1.140                1.515
Known Thermal SZ        1.221                1.385                1.661                1.756                1.421                1.022                1.137                1.163                1.626
Known Radio PS          1.185                1.238                1.447                1.691                1.364                1.007                1.063                1.143                1.589
Known Infrared PS       1.209                1.218                1.411                1.671                1.348                1.022                1.107                1.097                1.535
All Unknown             1.221                1.415                1.674                1.756                1.424                1.024                1.137                1.163                1.626

NOTES. Errors on cosmological parameters as we alter the knowledge of the foreground model. All numbers are listed relative to the results
when the foreground properties are known; note that this differs from the convention of Table 3. All results use the MID foreground model.
A prior of a factor of 10 has been used on the SZ component except where stated. In the first half of the table, we progressively introduce
each of the three different types of variations, applying them to all of the foreground components. The sets of foreground parameters for the
frequency dependence (the $q$ parameters), the frequency coherence (the $r$ parameters), and the shape of the spatial power spectrum (the $s$
parameters) are denoted by $\Theta$, $R$, and $C_\ell$, respectively. Note that the normalization of the fluctuations is carried by the
$\Theta$ uncertainties. Higher errors indicate that the experiment's performance is particularly sensitive to those uncertainties of the
foregrounds. In the second half of the table, we consider all the foreground properties to be unknown except for those of a given component.
Lower errors indicate that external information on that foreground would be particularly valuable.
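
The effect of an external prior on a foreground amplitude can be sketched with a two-parameter Fisher matrix. This is an illustration only, not the computation used here: the matrix entries and function names are hypothetical, and reading "a factor of 10" as a Gaussian prior of width ln(10) on a log-amplitude is an assumption. A Gaussian prior of width sigma on one parameter adds 1/sigma^2 to the corresponding diagonal element of the Fisher matrix.

```python
import numpy as np


def add_gaussian_prior(fisher, index, sigma_prior):
    """Return a copy of the Fisher matrix with a Gaussian prior on one parameter."""
    with_prior = np.array(fisher, dtype=float)
    with_prior[index, index] += 1.0 / sigma_prior**2
    return with_prior


def marginalized_error(fisher, index):
    """1-sigma error on one parameter after marginalizing over all others."""
    return float(np.sqrt(np.linalg.inv(fisher)[index, index]))


# Hypothetical Fisher matrix: parameter 0 is cosmological, parameter 1 is the
# log-amplitude of a foreground that the data alone constrain poorly.
fisher = np.array([[100.0, 9.0],
                   [  9.0, 1.0]])

# Assumption: a factor-of-10 prior is modeled as a Gaussian of width ln(10)
# on the log-amplitude.
sigma_prior = np.log(10.0)

without_prior = marginalized_error(fisher, 0)
with_prior = marginalized_error(add_gaussian_prior(fisher, 1, sigma_prior), 0)
print("error without prior:", without_prior)
print("error with prior:   ", with_prior)
```

The prior tightens the cosmological error because it breaks part of the degeneracy with the foreground amplitude; the error cannot drop below the value obtained with the foreground amplitude known exactly.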

5.4.5. Dependence on polarization type 

For which polarization type would prior knowledge of
the foreground properties most help cosmological parameter
estimation? The answer to this question depends on
where the cosmological parameter information is coming
from in the first place, and this in turn depends on the
parameter in question. Limiting ourselves first to the case
of known foreground properties, we can answer this question
using equation (47). If foregrounds and/or systematic
errors made the measured power spectrum $C_\ell^P$ completely
unusable, this would correspond to adding an infinite
amount of noise to the element $M_{PP}$ of the $4 \times 4$
covariance matrix of equation (45). The Fisher matrix
$\mathbf{F}_\ell$ of equation (^) therefore gets replaced by



$$
\mathbf{F}'_\ell = \lim_{\sigma\to\infty} \left[\mathbf{F}_\ell^{-1} + \sigma^2 \mathbf{J}\right]^{-1}, \qquad (63)
$$



where $\mathbf{J}$ is a $4 \times 4$ matrix with zeroes everywhere
except in element $(P, P)$; $J_{P'P''} = \delta_{PP'}\delta_{PP''}$.
This corresponds to simply crossing out row $P$ and column $P$ of
$\mathbf{F}_\ell^{-1}$, inverting the remaining $3 \times 3$ matrix,
and padding with zeroes. For example, if we drop the information
from X-polarization, we obtain

where the notation is the same as in equation (46). Likewise,
omitting two of the four power spectra corresponds
to crossing out two rows and columns before inverting,
etc., so we can compute the attainable accuracy on
cosmological parameters using any subset of the four power
spectra T, E, B and X.
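
The equivalence between the infinite-noise limit and the crossing-out recipe can be checked numerically. The sketch below rests on assumptions: it reads equation (63) as the limit of the inverse of the inverse Fisher matrix plus a growing noise term on element (P, P), it uses a random symmetric positive-definite matrix as a hypothetical stand-in for the per-multipole Fisher matrix, and the ordering (T, E, B, X) and the function names are chosen for illustration.

```python
import numpy as np


def drop_spectrum(fisher, p):
    """Cross out row/column p of the inverse, invert the rest, pad with zeros."""
    cov = np.linalg.inv(fisher)
    keep = [i for i in range(fisher.shape[0]) if i != p]
    reduced = np.zeros_like(fisher, dtype=float)
    reduced[np.ix_(keep, keep)] = np.linalg.inv(cov[np.ix_(keep, keep)])
    return reduced


def noisy_limit(fisher, p, noise):
    """Evaluate [F^-1 + noise * J]^-1 for a large but finite noise level."""
    j = np.zeros_like(fisher, dtype=float)
    j[p, p] = 1.0
    return np.linalg.inv(np.linalg.inv(fisher) + noise * j)


# Hypothetical symmetric positive-definite 4x4 matrix, with rows and columns
# taken to be ordered as (T, E, B, X).
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4))
fisher = a @ a.T + 4.0 * np.eye(4)

exact = drop_spectrum(fisher, p=3)              # discard X
approx = noisy_limit(fisher, p=3, noise=1e8)    # very noisy X
print("largest difference:", np.max(np.abs(exact - approx)))
```

Dropping two spectra works the same way, by removing two indices from `keep` before inverting.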

Figure 25 shows the results for our sample of three
cosmological parameters using five such subsets. Comparing
T alone with the other cases illustrates the well-known
fact that polarization helps substantially, especially with
$\tau$ and T/S (see e.g., Hogan et al. 1982; Bond & Efstathiou
1987; Zaldarriaga et al. 1997). We find that all parameters
that are sensitive to the acoustic peaks are like
$\Omega_b h^2$ in that the bulk of the polarization gain is
coming from X-polarization, manifested by T + X giving smaller
error bars than T + E and by the combination T + X + E being
only marginally better than T + X. On the other hand, for
$\tau$, E is seen to be more important than X for picking up
the large-scale bump caused by early reionization. The B
channel receives contributions from gravity waves alone.
However, it dominates the measurement of T/S only for
Planck because MAP does not have enough signal-to-noise
to yield interesting constraints on the B-polarization.

Fig. 25. Information about cosmological parameters from
different polarization types. In each panel, the bars show the
relative error bars using all information (TXEB), all but
B-polarization (TXE), T and X only (TX), T and E only (TE) and
unpolarized intensity alone (T). Results are shown both without
any foregrounds (yellow) and for the MID foreground scenario
with known properties [...]. Note that these color conventions
differ from those in Figure [...] and also that this figure uses
a Gaussian rather than exponential coherence function. The
rightmost bars in the $\tau$ panels extend far off the scale.

These results imply that a better understanding of foreground
polarization in X would most improve errors for
$\Omega_b h^2$, E for $\tau$ and ultimately B for T/S. We also
test these conclusions in the case of simultaneous estimation
of foreground and cosmological parameters by placing priors
separately on each of the polarization types; the results
confirm these tendencies.

6. CONCLUSIONS 

We have presented a comprehensive treatment of mi- 
crowave foregrounds and the manner in which they de- 
grade our ability to measure cosmological parameters with 
the CMB. Having developed three quantitative models, 
we compute their effect upon the Boomerang, MAP and 
Planck missions, including the level of foreground residu- 
als in the cleaned maps for various scenarios and the ex- 
tent to which this residual contamination would degrade 
the measurement of cosmological parameters. We con- 
sider both the case when the foreground power spectra are 
known and the case in which they must be computed from 
the CMB data itself. Our foreground model can be found 
at www.sns.ias.edu/~max/foregrounds.html together with 
software implementing our cleaning algorithm. 

Our results are generally encouraging, in that the exper- 
iments perform well in the face of rather severe foreground 
models. This success derives from the fact that the cosmic 
signals can be distinguished from foregrounds by their fre- 
quency dependence, their frequency coherence, and their 
spatial power spectra. With these handles on the cosmic 
signal, we find that the error bars on most cosmological 
parameters are degraded by less than a factor of two for 
our best-guess foreground model and by less than a factor 
of five in our most pessimistic scenario. Effects produc- 
ing large-angle polarization signals, namely reionization 
and tensor perturbations, suffer more because of their in- 
trinsically small cosmic amplitude, but even these can be 
accurately extracted in most cases. 

6.1. The most damaging foregrounds 

One useful result of this work is that it highlights which 
foregrounds are potentially most damaging for precision 
cosmology and therefore most in need of further study. 
We find that allowing for uncertainties in the properties 
of the foregrounds does cause a substantial degradation 
in performance relative to the case of known foreground 
properties. In the study of the acoustic peaks, these un- 
certainties were dominant; however, in the study of large- 
angle polarization, the mere presence of sample variance 
from the foregrounds was more important for Planck. 

Taken alone, the uncertainties in the shape of the power
spectra were more important than the uncertainties in
either the frequency dependence or the frequency coherence.
In the case of tensors, the combination of frequency
dependence and frequency coherence was particularly
important. Of course, combinations of excursions are usually
worse than the sum of individual excursions.

In the case of MAP, adding external information about
vibrating dust made the most improvement in the results.
Point source information also helped in the temperature
data on the acoustic peaks. Knowing the level of thermal
SZ fluctuations from filaments to within a factor of
10 a priori noticeably improved the results, so
order-of-magnitude limits on this effect from simulations or
observations will be valuable. In the case of Boomerang,
restricting the ability of free-free and synchrotron
emission to pollute the 90 GHz channel was most important.
Clearly, MAP and Boomerang will complement each other
in their constraints on possible pathologies of the
intensity of foregrounds, as the experiments cover relatively low
and high frequencies, respectively. In the case of Planck,
the foreground cleaning in the MID model was sufficiently
good that most foregrounds had only a minor impact. The
largest degradations were due to radio point sources and
synchrotron radiation.

Fig. 26. As Figure 19 but showing the relative degradation
in error bars from MAP on four cosmological parameters as the
amplitude of foregrounds is increased. The histograms show
results for a series of foreground models based on our MID
model, with amplitudes of all components multiplied by 0 (i.e.,
no foregrounds), 1, 3, and 10. The results are scaled to the
no-foreground case; the $1\sigma$ errors in this case are listed
in each panel. (lightly shaded) Results with foregrounds of
known properties. (heavily shaded) Results with foregrounds with
unknown parameters that must be simultaneously estimated from
the CMB data.

Since we found that temperature-polarization
cross-correlation carries much more information than
E-polarization on most cosmological parameters (the exceptions
being $\tau$ and T/S), it is clearly important to accurately
model and measure the cross-correlation between polarized and
unpolarized foregrounds.

6.2. Robustness 

How robust are these results? Have we been too conservative
or too optimistic? In general, we have tried to err
on the side of caution, occasionally to the point of playing 
the role of the Devil's advocate. We view the MID model 
as slightly cautious and the PESS model as quite extreme, 
on the verge of being ruled out by current constraints. 
We have also been conservative in not taking advantage of 
foreground dependence on Galactic latitude except in the 
crudest way, with a Galactic cut. In the same spirit, we 
have not included information from non-CMB templates 
such as the DIRBE or Haslam maps. The formalism we 
have presented is general enough that both of these types 
of additional information can be included, the latter sim- 
ply by including the foreground templates as additional 
"channels" in the analysis. 



However, there are also ways in which real-world
foregrounds may be worse than we have assumed. We have
made the simplifying assumption that each physical
component is separable in $\ell$ and $\nu$, i.e., that only the
amplitude (not the shape) of its power spectrum depends
on frequency. This needs to be tested empirically, and
may reveal that certain foregrounds decompose into
several separable subcomponents. Perhaps most importantly,
we have modeled foregrounds as Gaussian, which is
certainly incorrect at some level. Our removal method still
succeeds in minimizing the rms residual even if the
foregrounds are non-Gaussian, and all our plots of residual
power spectra remain correct (since they involve second
moments only), but the error bars on the measured power
spectra (which involve fourth moments) that propagate
into the calculations of cosmological parameter accuracy
will change in this case, probably for the worse. As
mentioned, the variance of a measurement of, say, $C_\ell$ will
be $2/N$ for the Gaussian case, where
$N = (2\ell + 1) f_{\rm sky}$ is the effective number of
independent modes that probe this quantity. For a measurement
of the band power between $\ell_1$ and $\ell_2$, we have
$N = [(\ell_2 + 1)^2 - \ell_1^2] f_{\rm sky}$ modes.
Foreground non-Gaussianity typically correlates these modes,
reducing the effective number of independent modes and
thereby increasing the variance on the measured multipole
or band power. We explore the effect of such errors in a
very crude way in Figure 26, by simply increasing the
foreground amplitudes by various factors.$^{11}$ An amplitude
increase $Q$ causes an increase in the power spectrum of $Q^2$,
corresponding to a variance increase of $Q^4$ and a reduction
of $N$ by $Q^4$. For instance, increasing all foreground
amplitudes by a factor $Q = 10$ corresponds to reducing the
number of independent modes by 10,000. This is likely
to be more severe than the actual level of foreground
non-Gaussianity, since it would imply, e.g., that all 10,000
multipole modes $a_{\ell m}$ up to $\ell = 100$ would be almost
perfectly correlated. It should be noted that in the extreme
case where non-Gaussianity gives perfect correlations between
neighboring multipoles, foregrounds become trivial to
remove by projecting out an overall offset. In other words,
the worst possible case lies somewhere in between the
extremes of no mode correlation and complete mode
correlation. A detailed study of the non-Gaussian properties
of foregrounds would certainly be worthwhile, using, e.g.,
the WOMBAT compilation of foreground data (Gawiser
et al. 1999).
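
The mode-counting argument above can be made concrete with a few lines of arithmetic. This is a minimal sketch: the function names are hypothetical, and 2/N is read here as the relative variance of a Gaussian band-power measurement. The band formula is just the sum of (2 ell + 1) over the band, and boosting the foreground amplitude by Q is modeled, as in the text, by dividing N by Q^4.

```python
def n_modes(ell_min, ell_max, f_sky=1.0):
    """Effective number of independent modes in the band ell_min..ell_max."""
    return ((ell_max + 1) ** 2 - ell_min ** 2) * f_sky


def relative_variance(ell_min, ell_max, f_sky=1.0, boost=1.0):
    """Relative variance 2/N, with N reduced by boost**4 (the factor Q above)."""
    return 2.0 / (n_modes(ell_min, ell_max, f_sky) / boost ** 4)


# Number of modes in the band 10..20, counted directly and via the formula.
direct_count = sum(2 * ell + 1 for ell in range(10, 21))
print(direct_count, n_modes(10, 20))

# Number of a_lm modes below ell = 100 on the full sky.
print(n_modes(0, 99))

# Relative variance for that band with Q = 10.
print(relative_variance(0, 99, boost=10))
```

For Q = 10 the effective number of modes below ell = 100 drops to order unity, which is why the text regards this case as more severe than realistic non-Gaussianity.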

6.3. Comparison with other work 

A number of excellent treatments of foregrounds and
their impact on CMB measurements have been published.
Thorough and recent ones that are particularly relevant to
this paper are those done for the Planck proposal (TE96;
Bouchet et al. 1996; Bersanelli et al. 1996; Bouchet et al.
1998; AAO 1998; BG99) and K99. Although these studies
did not compute the accuracies with which cosmological
parameters could be measured, they all calculated residual
power spectra in the cleaned maps and their associated
error bars, which can be compared with ours. We
typically find slightly higher levels of residual foreground.
Apart from minor differences in the assumed foreground
power spectra etc., this is because we do not assume that
the foreground covariance between different frequencies is
a matrix of rank 1 or 2. The former assumption (made
in, e.g., TE96 and Bersanelli et al. 1996) corresponds to
assuming $\Delta\alpha = 0$, i.e., perfect frequency coherence.
The latter, used in, for instance, BG99 and K99, is equivalent
to assuming that each foreground can be decomposed into
two perfectly coherent components.

11 It is easy to show that asymptotically, as $Q \to \infty$ and
foregrounds dominate over sample variance and detector noise, the
parameter error bars will scale as $Q^2$. Figure 26 shows that we
are far from that limit, with a foreground increase giving a much
weaker response.

The elegant treatment of K99 gives foregrounds even
more leeway than we have, with thousands of free
parameters, allowing their power spectra to be completely
arbitrary functions of $\ell$ and fitting for them directly from
the data. Unfortunately, this only works for the
above-mentioned rank 2 assumption about coherence for Planck,
since the number of components cannot exceed half of the
number of channels. We have restricted the foreground
power spectra, frequency spectra and frequency
correlations to be fairly smooth functions, since all such
functions measured to date have been fairly dull and featureless.

6.4. Outlook 

A large number of papers have now painted a rosy
picture of the future of cosmology, with CMB experiments
measuring cosmological parameters to unprecedented
accuracy over the next decade. In this paper, we have tried
quite hard to spoil this picture, using foreground models
with hundreds of harmful parameters and pushing them
to the limits of physical plausibility and current constraints.
Although we have found that great care needs to be taken
in the foreground removal phase of the data analysis to
avoid potentially perilous pitfalls, we have failed to tarnish
the overall picture with more than a few minor blemishes,
degrading the accuracy on certain measurements by small
factors. Although much work certainly remains to be done
on the foreground problem, this is cause for cautious
optimism.